Help removing a duplicate disk entry

cheeki-breeki

New Member
Oct 20, 2023
"ls -l /dev/disk/by-id" shows a duplicate on a wiped disk:
(disk id)
(disk id)_1

As a result, when I run "sgdisk /dev/disk/by-id/existingdrive -R /dev/disk/by-id/newdrive",
a duplicate set is created:
(disk id)-part1 -> ../../sdf1
(disk id)-part2 -> ../../sdf2
(disk id)-part3 -> ../../sdf3
(disk id)_1-part1 -> ../../sdf1
(disk id)_1-part2 -> ../../sdf2
(disk id)_1-part3 -> ../../sdf3
 
As a result, when I run "sgdisk /dev/disk/by-id/existingdrive -R /dev/disk/by-id/newdrive",
a duplicate set is created:
(disk id)-part1 -> ../../sdf1
(disk id)-part2 -> ../../sdf2
(disk id)-part3 -> ../../sdf3
(disk id)_1-part1 -> ../../sdf1
(disk id)_1-part2 -> ../../sdf2
(disk id)_1-part3 -> ../../sdf3
That is what I would expect. As the documentation of sgdisk states:
Replicate the main device's partition table on the specified second device. Note that the replicated partition table is an exact copy, including all GUIDs; if the device should have its own unique GUIDs, you should use the -G option on the new disk.
This is not Proxmox-specific, and other Linux documentation and guides might give additional help. EDIT: Maybe use -G instead, or randomize the partition GUIDs afterwards?
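A minimal sketch of that approach, reusing the placeholder device paths from above (substitute your real by-id paths):

Code:
# Replicate the partition table from the existing drive to the new one
sgdisk /dev/disk/by-id/existingdrive -R /dev/disk/by-id/newdrive
# Then randomize the disk and partition GUIDs on the new drive so they are unique
sgdisk -G /dev/disk/by-id/newdrive
# Ask the kernel to re-read the new drive's partition table
partprobe /dev/disk/by-id/newdrive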
 
I don't know exactly how to correct the duplicate situation (it probably depends on what caused it), but have you tried rebooting the system without the disk attached and then reattaching it?
 
That is what I would expect. As the documentation of sgdisk states:

This is not Proxmox-specific, and other Linux documentation and guides might give additional help. EDIT: Maybe use -G instead, or randomize the partition GUIDs afterwards?

Yes, the problem is not sgdisk; that was just an example.
Somehow the second entry for the disk was created, and I can't remove it.

I don't know exactly how to correct the duplicate situation (it probably depends on what caused it), but have you tried rebooting the system without the disk attached and then reattaching it?
That would be a hassle. I suspect it might be a ZFS thing.

Maybe I need to use pvesm free or pvesm remove?
 
Remove the unwanted partitions on the drive with gdisk (or GParted Live)? Sometimes a reboot or partprobe command is necessary to inform the kernel of partition changes.
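For example, non-interactively from the shell, assuming the drive is /dev/sdX (a placeholder; double-check the device with lsblk first):

Code:
# Confirm you are looking at the right device before touching it
lsblk /dev/sdX
# Destroy the GPT and MBR data structures on the whole drive
sgdisk --zap-all /dev/sdX
# Tell the kernel to re-read the (now empty) partition table
partprobe /dev/sdX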

It appears not to be a partition.
"fdisk -l" shows none.
I've used:
wipefs -a "disk"
sudo sfdisk --delete "disk"
and I've wiped it in the Proxmox web GUI.

"partprobe" did nothing, which I'd expect after having rebooted many times.
 
It appears not to be a partition.
"fdisk -l" shows none.
I've used:
wipefs -a "disk"
sudo sfdisk --delete "disk"
and I've wiped it in the Proxmox web GUI.

"partprobe" did nothing, which I'd expect after having rebooted many times.
Is this a standard Proxmox installation or did you add udev rules or run some scripts from the internet? What is the current output of ls -alh /dev/disk/by-id and lsblk (in CODE-tags please)?
 
That would be a hassle.
The length of this forum issue/thread is also a hassle for you & for others trying to help!

I suspect it might be a ZFS thing.
Care to elaborate?

Maybe I need to use pvesm free or pvesm remove?
What makes you think this disk exists in the Proxmox storage backend?


You can't expect help if you don't provide at least the minimum adequate info: the history of that drive, what changed, and what is happening now.
 
Is this a standard Proxmox installation or did you add udev rules or run some scripts from the internet? What is the current output of ls -alh /dev/disk/by-id and lsblk (in CODE-tags please)?

It ought to be standard; it was installed from the PVE ISO.

ls -alh /dev/disk/by-id:

Code:
nvme-eui.(id1)-> ../../nvme0n1
nvme-(brand)_SSD_(model)_(id2) -> ../../nvme0n1
nvme-(brand)_SSD_(model)_(id2)_1 -> ../../nvme0n1

lsblk:
nvme0n1 259:0 0 1.8T 0 disk


The length of this forum issue/thread is also a hassle for you & for others trying to help!


Care to elaborate?


What makes you think this disk exists in the Proxmox storage backend?


You can't expect help if you don't provide at least the minimum adequate info: the history of that drive, what changed, and what is happening now.
1) Sorry, I assumed he meant physically removing the drive.

2) I don't know. I did some operations trying to replace the boot disk; not sure which caused it.

3) See 2).

4) It had a ZFS pool on it. I wiped it and ran the sgdisk commands mentioned above.
 
Can you remove it with rm /dev/disk/by-id/nvme-(brand)_SSD_(model)_(id2)_1 ? If it returns, can you check which udev rules triggered for this path?
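If it helps, one way to inspect what udev maintains for the device, assuming it is /dev/nvme0n1 as in your earlier output:

Code:
# Print the udev database entry, including all DEVLINKS (the by-id symlinks)
udevadm info --query=all --name=/dev/nvme0n1
# Simulate a udev event for the device and show which rules files were applied
udevadm test /sys/class/block/nvme0n1 2>&1 | less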

Yeah, it has disappeared from ls.

Can I expect there are references somewhere that need to be removed?

How do I check udev rules?
 
If I understand you correctly, there might be a fundamental misunderstanding:

The entire structure below /dev/disk/by-* consists of only symlinks to physical devices, which are dynamically generated by udev based on hardware data (i.e. mostly for convenience). For any sgdisk -R command (to clone a partition table to another device), while of course you can use any by-id device path, sgdisk will "just" follow the symlink to the actual device and use that. Because of the changes on the target device, udev will then also generate symlinks for it. So what you are seeing in /dev/disk/by-* is merely the result, not the cause.
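A quick way to confirm that both names resolve to the same block device, using the placeholder names from your ls output above:

Code:
# Both commands should print the same target, e.g. /dev/nvme0n1
readlink -f "/dev/disk/by-id/nvme-(brand)_SSD_(model)_(id2)"
readlink -f "/dev/disk/by-id/nvme-(brand)_SSD_(model)_(id2)_1"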

"ls -l /dev/disk/by-id" shows a duplicate on a wiped disk:
(disk id)
(disk id)_1
This is entirely intentional for NVMe devices and cannot be called a duplicate: it is merely an additional symlink to the exact same device/partition under an alternative name. The reason is NVMe's concept of namespaces, where "1" is the default-existing namespace, i.e. udev creates symlinks both for the default (unnamed) and for the explicit "1" namespace. If your device had multiple namespaces configured, you would see additional symlinks like "(disk id)_2" etc. You can check these with "nvme list" and other sub-commands of the nvme-cli package.
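For example (assuming the controller device is /dev/nvme0, adjust as needed):

Code:
# List all NVMe devices and the namespaces the system sees
nvme list
# List the namespace IDs present on the first controller
nvme list-ns /dev/nvme0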

If you clone the partition table of one disk to another, the generated partition symlinks for the target will depend on that device's namespace(s) because namespaces and partitions are different concepts (but in your case both devices seem to have only the default "1" namespace which is perfectly fine).

So what you are seeing is simply not a problem. ;)

Regards
 
rm /dev/disk/by-id/nvme-\("TAB-key"
What do you mean?

If I understand you correctly, there might be a fundamental misunderstanding:

The entire structure below /dev/disk/by-* consists of only symlinks to physical devices, which are dynamically generated by udev based on hardware data (i.e. mostly for convenience). For any sgdisk -R command (to clone a partition table to another device), while of course you can use any by-id device path, sgdisk will "just" follow the symlink to the actual device and use that. Because of the changes on the target device, udev will then also generate symlinks for it. So what you are seeing in /dev/disk/by-* is merely the result, not the cause.


This is entirely intentional for NVMe devices and cannot be called a duplicate: it is merely an additional symlink to the exact same device/partition under an alternative name. The reason is NVMe's concept of namespaces, where "1" is the default-existing namespace, i.e. udev creates symlinks both for the default (unnamed) and for the explicit "1" namespace. If your device had multiple namespaces configured, you would see additional symlinks like "(disk id)_2" etc. You can check these with "nvme list" and other sub-commands of the nvme-cli package.

If you clone the partition table of one disk to another, the generated partition symlinks for the target will depend on that device's namespace(s) because namespaces and partitions are different concepts (but in your case both devices seem to have only the default "1" namespace which is perfectly fine).

So what you are seeing is simply not a problem. ;)

Regards
Thank you for the elaborate answer.

Running partprobe brought back the _1 entry.
Which of them should I then use for sgdisk?

The problem is that on the root disk there's part1, part2, part3.
Using sgdisk -R on the target disk creates part1, part2, part3, but also the duplicate _1-part1, _1-part2, _1-part3 symlinks.
It seems bloated and redundant.

By the way, allow me to ask something a bit off-topic: for replacing the boot disk, are these commands adequate?
Code:
sgdisk "old-disk" -R "new-disk" -b=backup
sgdisk -G "new-disk"
lsblk "old-disk" "new-disk"
zpool attach rpool "old-disk" "new-disk"
zpool status    (wait for resilver to complete)
proxmox-boot-tool status    <new disk's ESP>
proxmox-boot-tool format "new-disk"
proxmox-boot-tool init "new-disk" <new disk's ESP>
proxmox-boot-tool refresh
zpool detach rpool "old-disk"
 
Running partprobe brought back the _1 entry.
Which of them should I then use for sgdisk?

The problem is that on the root disk there's part1, part2, part3.
Using sgdisk -R on the target disk creates part1, part2, part3, but also the duplicate _1-part1, _1-part2, _1-part3 symlinks.
It seems bloated and redundant.
It really does not matter as they are pointing to the exact same devices. Example from a 2-disk NVMe system:

Code:
$ ls -l /dev/disk/by-id/nvme*
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848 -> ../../nvme0n1
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848-part1 -> ../../nvme0n1p1
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848-part2 -> ../../nvme0n1p2
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848-part3 -> ../../nvme0n1p3
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848-part4 -> ../../nvme0n1p4
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848_1 -> ../../nvme0n1
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848_1-part1 -> ../../nvme0n1p1
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848_1-part2 -> ../../nvme0n1p2
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848_1-part3 -> ../../nvme0n1p3
/dev/disk/by-id/nvme-KINGSTON_SA2000M8250G_50026B7683B8E848_1-part4 -> ../../nvme0n1p4
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411 -> ../../nvme1n1
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411-part1 -> ../../nvme1n1p1
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411-part2 -> ../../nvme1n1p2
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411-part3 -> ../../nvme1n1p3
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411-part4 -> ../../nvme1n1p4
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411_1 -> ../../nvme1n1
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411_1-part1 -> ../../nvme1n1p1
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411_1-part2 -> ../../nvme1n1p2
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411_1-part3 -> ../../nvme1n1p3
/dev/disk/by-id/nvme-PNY_CS3030_250GB_SSD_PNY09200003790100411_1-part4 -> ../../nvme1n1p4

Although there are "redundant" symlinks, in the end they point to only 2 physical devices and their partitions, /dev/nvme0n1 and /dev/nvme1n1.

By the way, allow me to ask something a bit off-topic: for replacing the boot disk, are these commands adequate?
Code:
sgdisk "old-disk" -R "new-disk" -b=backup
sgdisk -G "new-disk"
lsblk "old-disk" "new-disk"
zpool attach rpool "old-disk" "new-disk"
zpool status    (wait for resilver to complete)
proxmox-boot-tool status    <new disk's ESP>
proxmox-boot-tool format "new-disk"
proxmox-boot-tool init "new-disk" <new disk's ESP>
proxmox-boot-tool refresh
zpool detach rpool "old-disk"
Mostly. zpool attach rpool "old-disk" "new-disk" should be zpool attach rpool "old-disk's_zfs_partition" "new-disk's_zfs_partition" though, not the entire disks. And consequently, use the new disk's ESP partition for all proxmox-boot-tool commands.

Be aware that while my previous comment ("it's all symlinks pointing to the same devices, so it does not matter") still stands, zpool attach will recognize when you point to devices via their by-id path (or any other stable path, like by-partuuid, by-partlabel, etc.) and use those stable identifiers, so you typically WANT that behaviour and should AVOID using something like zpool attach rpool /dev/nvme0n1.
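Putting both corrections together, a rough sketch of the adjusted sequence, assuming the standard Proxmox ZFS layout where part2 is the ESP and part3 is the ZFS partition ("OLD-DISK"/"NEW-DISK" are placeholders for the by-id names; verify the layout with lsblk first):

Code:
# Clone the partition table to the new disk and give it unique GUIDs
sgdisk /dev/disk/by-id/OLD-DISK -R /dev/disk/by-id/NEW-DISK
sgdisk -G /dev/disk/by-id/NEW-DISK
# Attach the new disk's ZFS partition as a mirror and wait for the resilver to finish
zpool attach rpool /dev/disk/by-id/OLD-DISK-part3 /dev/disk/by-id/NEW-DISK-part3
zpool status rpool
# Prepare the new disk's ESP and register it with proxmox-boot-tool
proxmox-boot-tool format /dev/disk/by-id/NEW-DISK-part2
proxmox-boot-tool init /dev/disk/by-id/NEW-DISK-part2
proxmox-boot-tool refresh
# Only after the resilver is complete (and booting from the new disk works)
zpool detach rpool /dev/disk/by-id/OLD-DISK-part3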

Regards
 
Hi folks,

not the same issue, but related to the thread title:

I moved a drive from a VM to another storage and forgot to tick "remove the original". Now, when trying to delete the original drive through the WebUI, it complains that a VM with the same ID still exists.

What is the correct way to delete that duplicated drive?
 
The original disk is no longer attached nor present in the VM config, just the copied drive.
Your reply is slightly ambiguous. Have you solved the issue or not? If it is solved, ignore the rest of this post:

Interesting....

Maybe try in the node console: qm rescan --vmid <VMID> (e.g. if the VMID was 123, you would enter qm rescan --vmid 123).

Could you also show the output of qm config <VMID>?

Also what is the full name of the (undeletable) image file?
 
Hello again

Sorry about that. No, I left the VM in that state.

That did the trick sir! :)

Code:
root@pve1:~# qm rescan --vmid 124
rescan volumes...
VM 124 add unreferenced volume 'NFS-NAS1:124/vm-124-disk-1.qcow2' as 'unused0' to config

There was no trace of the disk in the config file before the rescan. Now I can follow your earlier instructions and delete the disk directly from the VM hardware view.

Could this be a bug related to not performing a rescan after copying a disk through the WebUI? I'll give it a try via CLI to see if the behavior is the same.
Pvesm uses "copy" and "move" as subcommands, while qm uses the "--delete" option. I assume the WebUI also relies on pvesm internally.

I usually tick "delete source disk" when relocating VMs. In the past, I forgot to tick it a few times, and I always worked around the issue by manually deleting the corresponding folder in the storage images directory.
This time I was looking for a more user-friendly, graphical solution.
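Regarding the CLI attempt mentioned above, a minimal sketch with the delete flag set; the VMID, disk slot, and storage name here are only illustrative (loosely based on the outputs in this thread) and need to be adapted:

Code:
# Move the scsi0 disk of VM 124 to the NFS-NAS1 storage and delete the source image afterwards
qm move_disk 124 scsi0 NFS-NAS1 --delete 1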

Here was the state before the rescan:

Code:
root@pve1:~# qm config 124
agent: 1
bios: ovmf
boot: order=scsi0;net0;ide0
cores: 4
cpu: host
efidisk0: ceph-nvme:124/vm-124-disk-0.qcow2,efitype=4m,pre-enrolled-keys=1,size=528K
ide0: none,media=cdrom
machine: pc-q35-8.1
memory: 4096
meta: creation-qemu=8.1.5,ctime=1722966437
name: LABSPSWKS001
net0: virtio=BC:24:11:FD:EB:36,bridge=LAB
numa: 1
ostype: win11
scsi0: ceph-nvme:vm-124-disk-0,cache=writeback,iothread=1,size=80G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=a3796184-c892-43f9-baea-1e2a3fcd860a
sockets: 1
vmgenid: 02ea5246-d899-4517-9b20-147ddec94543

Again, thank you for your help :)
Regards.