[SOLVED] ZFS problem with UUIDs

fireon

Hello,

I've installed a test system here.

The Proxmox system is on 1 HDD, ZFS RAID0.

After that I added another 4 disks as RAID10:

Code:
 pool: rpool
 state: ONLINE
  scan: none requested
config:


    NAME        STATE     READ WRITE CKSUM
    rpool       ONLINE       0     0     0
      sda3      ONLINE       0     0     0


errors: No known data errors


  pool: v-machines
 state: ONLINE
  scan: none requested
config:


    NAME                                            STATE     READ WRITE CKSUM
    v-machines                                      ONLINE       0     0     0
      mirror-0                                      ONLINE       0     0     0
        ata-WDC_WD5000AAKS-00UU3A0_WD-WCAYU7589703  ONLINE       0     0     0
        ata-Hitachi_HDS721050CLA362_JP1570HR2E4JKK  ONLINE       0     0     0
      mirror-1                                      ONLINE       0     0     0
        ata-WDC_WD5002ABYS-01B1B0_WD-WCASY2290382   ONLINE       0     0     0
        ata-WDC_WD5003ABYX-01WERA0_WD-WMAYP1214239  ONLINE       0     0     0


errors: No known data errors
Then I tried to import using the UUIDs:

Code:
root@proxmoxtest:~# zpool import -d /dev/disk/by-uuid/ -a
no pools available to import

But this did not work, and after a reboot the system did not come up, because the rpool disk was not sda anymore. I reinstalled the system from scratch. But how can I fix this ugly UUID problem?

When I install rpool as RAID1, the UUIDs are already OK.

Thanks
 
But this did not work, and after a reboot the system did not come up, because the rpool disk was not sda anymore.

But can the BIOS/GRUB still see that sda drive? Can you tell at what stage the reboot fails?
 
Boot with the "raid10" drives pulled. That way, your disk should be /dev/sda regardless.
Then import the pool with zpool import -d /dev/disk/by-id

I have 7 drives in my server in various zpools and none is found in /dev/disk/by-uuid.
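For reference, a minimal sketch of that approach, assuming the pool name from the output above (export first so the pool can be re-imported with the long device names):

Code:
# export the data pool, then re-import it scanning /dev/disk/by-id
zpool export v-machines
zpool import -d /dev/disk/by-id v-machines
# the long ata-* names should now show up here
zpool status v-machines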
 
Boot with the "raid10" drives pulled. That way, your disk should be /dev/sda regardless.
Then import the pool with zpool import -d /dev/disk/by-id

I have 7 drives in my server in various zpools and none is found in /dev/disk/by-uuid.

Really? What the f....

See here, the same problem on another server (rpool only). After an update and reboot, the UUIDs are gone in "zpool status". Then I imported with your command (I think my command was wrong) and the system said:

Code:
zpool import -d /dev/disk/by-id
   pool: raid10
     id: 12666622118334122185
  state: UNAVAIL
 status: The pool was last accessed by another system.
 action: The pool cannot be imported due to damaged devices or data.
   see: http://zfsonlinux.org/msg/ZFS-8000-EY
 config:


    raid10                                         UNAVAIL  insufficient replicas
      mirror-0                                     UNAVAIL  insufficient replicas
        ata-WDC_WD1003FBYX-18Y7B0_WD-WCAW32586323  UNAVAIL
        ata-WDC_WD1003FBYX-18Y7B0_WD-WCAW32477382  UNAVAIL
      mirror-1                                     UNAVAIL  insufficient replicas
        ata-WDC_WD1003FBYX-18Y7B0_WD-WCAW32608617  UNAVAIL
        ata-WDC_WD1003FBYX-18Y7B0_WD-WCAW32553928  UNAVAIL

But there was never a RAID10 pool on this machine, so what is going on?

Code:
pool: rpool
 state: ONLINE
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    sda3    ONLINE       0     0     0
	    sdb3    ONLINE       0     0     0
	    sdc3    ONLINE       0     0     0
	    sdd3    ONLINE       0     0     0

Sorry, don't be confused, this is another machine; the damaged one is not here in the office (I will report on it later).
 
When you don't specify the zpool name in import, you are presented with importable pools only (it scans all drives that are not yet part of an imported pool and looks for pool signatures). The actual import requires the pool name or its id at the end.
For the above example with "UNAVAIL" things look pretty ugly, because the "raid10" pool signature has been found at least once, but ALL (??) of the drives are missing. I'm wondering, if all the drives are UNAVAIL, how the signature could be found in the first place.

Try a simple zpool import without any parameters.
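As a sketch of the difference (a bare listing does not import anything; with everything UNAVAIL above, an actual import of that pool would of course still fail):

Code:
# just list importable pools, nothing is imported yet
zpool import
# an actual import then needs the pool name ...
zpool import raid10
# ... or the numeric id from the listing
zpool import 12666622118334122185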

Anyway, let's keep the discussion around a single issue/server. If we jump between them, it is hard to follow.
 
I've just noticed something in what you've said:

After an update and reboot, the UUIDs are gone in "zpool status"


What you are showing are IDs (/dev/disk/by-id), not UUIDs (/dev/disk/by-uuid).
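A quick way to see the difference on the machine itself (assuming a standard udev setup; ZFS member disks often get no by-uuid link at all):

Code:
# long, stable hardware names - what zpool status shows above
ls -l /dev/disk/by-id/
# filesystem UUIDs - ZFS members are often missing here entirely
ls -l /dev/disk/by-uuid/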

So, on your first issue (with /dev/sda renaming) I would do the following:
- shutdown server
- pull out the drives in "v-machines" pool
- start server. The only disk left should be named /dev/sda so your "rpool" should be importable by this name if needed
- rm /etc/zfs/zpool.cache*
- reboot server
- At this time "rpool" should be OK and using "by-id" (long name, NOT UUID) if the scripts are scanning for /dev/disk/by-id
- shutdown server
- insert "v-machines" drives back
- start server
- if "v-machines" pool is not imported yet, do a
Code:
zpool import -d /dev/disk/by-id v-machines

Because of this mess you might run into some errors mounting ZFS filesystems, if you have created any in "v-machines".
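Roughly, as a command sketch (pool and device names taken from this thread; the physical pulling and re-inserting of the "v-machines" drives happens between the reboots as described in the list):

Code:
# with the v-machines drives pulled, rpool's disk is /dev/sda again
rm -f /etc/zfs/zpool.cache*
reboot
# after re-inserting the drives and booting, import by-id if needed
zpool import -d /dev/disk/by-id v-machines
zpool status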
 
Hello all,

I've done all these things. But:

- without the other disks the first disk is sda
- with the other disks the first disk is sde and the system can't boot; then I must change the HDD order in the BIOS
- I was not able to get the disk ID for the system disk, but it is available
- I was able to get the IDs for the v-machines pool, by exporting and then importing the pool
- but after a reboot all IDs are gone again

I have done and tested this four times, changing settings... but every time the same result.

ZFS version 0.6.4-4

My production system is on 0.6.3-4, but not with a SATA controller; it has an IBM SAS controller. I will do a downgrade and then test again.


Supplement: a downgrade is not possible.
 
I don't understand how you could import the pool using by-id the first time (export & import) but after a reboot they are gone. Isn't the first time also after a reboot?
 
I don't understand how you could import the pool using by-id the first time (export & import) but after a reboot they are gone. Isn't the first time also after a reboot?

A pool export and then import looks like this:

Code:
pool: rpool
 state: ONLINE
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  sde3      ONLINE       0     0     0


errors: No known data errors


  pool: v-machines
 state: ONLINE
  scan: none requested
config:


	NAME                                            STATE     READ WRITE CKSUM
	v-machines                                      ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    ata-WDC_WD5000AAKS-00UU3A0_WD-WCAYU7589703  ONLINE       0     0     0
	    ata-Hitachi_HDS721050CLA362_JP1570HR2E4JKK  ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    ata-WDC_WD5002ABYS-01B1B0_WD-WCASY2290382   ONLINE       0     0     0
	    ata-WDC_WD5003ABYX-01WERA0_WD-WMAYP1214239  ONLINE       0     0     0


errors: No known data errors
After I reboot this machine I have this:

Code:
pool: rpool
 state: ONLINE
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	rpool       ONLINE       0     0     0
	  sde3      ONLINE       0     0     0


errors: No known data errors


  pool: v-machines
 state: ONLINE
  scan: none requested
config:


	NAME        STATE     READ WRITE CKSUM
	v-machines  ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sdd     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0
	  mirror-1  ONLINE       0     0     0
	    sda     ONLINE       0     0     0
	    sdb     ONLINE       0     0     0


errors: No known data errors
 
It looks like your /etc/zfs/zpool.cache didn't get (re-)created.
Please do this:
- rm -f /etc/zfs/zpool.cache*
- import the pools how you like them (with /dev/disk/by-id)
- zpool set cachefile=/etc/zfs/zpool.cache v-machines
- zpool set cachefile=/etc/zfs/zpool.cache rpool
- reboot

I don't know if zpool.cache is saved in initramfs. Just to make sure, before you reboot do
Code:
update-initramfs -u -k all
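One way to check whether the cache file actually ends up in the initramfs (assuming Debian's initramfs-tools; whether the zfs-initramfs hook copies it depends on the package version):

Code:
# rebuild the initramfs for all installed kernels
update-initramfs -u -k all
# then look for the cache file inside the image for the running kernel
lsinitramfs /boot/initrd.img-$(uname -r) | grep zpool.cache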
 
Thanks a lot, but this also does not work. Same problem. OK, I will set up the same machine again and use the same ZFS version as on my production server, then run the same tests. I'll be back shortly :)
 
Make sure you run update-initramfs -u -k all when resetting the cachefile.
 
Make sure you run update-initramfs -u -k all when resetting the cachefile.
Yes, I have done this.

So, I installed once again and wanted to set the same ZFS version as on my production system. But I can't: I was not able to update to that version, and I was not able to downgrade to it. I can't resolve the dependencies.

Code:
2 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Setting up zfsutils (0.6.4-3~wheezy) ...
insserv: There is a loop between service umountfs and zfs-zed if stopped
insserv:  loop involving service zfs-zed at depth 6
insserv:  loop involving service zfs-import at depth 5
insserv: There is a loop between service umountfs and zfs-zed if stopped
insserv:  loop involving service umountfs at depth 3
insserv:  loop involving service umountnfs at depth 2
insserv:  loop involving service umountroot at depth 5
insserv:  loop involving service networking at depth 3
insserv:  loop involving service zvol at depth 6
insserv: exiting now without changing boot order!
update-rc.d: error: insserv rejected the script header
dpkg: error processing zfsutils (--configure):
 subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of zfs-initramfs:
 zfs-initramfs depends on zfsutils; however:
  Package zfsutils is not configured yet.


dpkg: error processing zfs-initramfs (--configure):
 dependency problems - leaving unconfigured
Processing triggers for initramfs-tools ...
update-initramfs: Generating /boot/initrd.img-3.10.0-11-pve
Errors were encountered while processing:
 zfsutils
 zfs-initramfs
E: Sub-process /usr/bin/dpkg returned an error code (1)
 
OK, thanks. I think these errors all depend on my configuration; it depends on the HDD order in the BIOS. When you have the boot partition on all disks, you do not have these problems. And the disk IDs don't really matter: it works the same with by-id names as with sdX names. I tested this. It is a little bit confusing, but that is how it is.
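For completeness, a rough sketch of how a boot loader could be put on every disk so that the BIOS disk order stops mattering (assuming a BIOS/GRUB setup; the device names are examples only):

Code:
# install GRUB to the MBR of each disk that should be bootable
for d in /dev/sda /dev/sdb /dev/sdc /dev/sdd; do
    grub-install "$d"
done
update-grub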

Best Regards :)
 
