Hi. I upgraded Proxmox on three nodes, each hosting 2 dedicated SSD drives for sheepdog, XFS-mounted on /var/lib/sheepdog/disc{0,1}.
The initial setup was done on each node under Proxmox 2.3, and clustering worked at that point.
Now, after the upgrade to 3.0, sheepdog won't start and complains:
Aug 19 01:57:18 [main] init_obj_path(245) /var/lib/sheepdog/disc0 is meta-store, abort
I checked the code, and this error message is emitted because there are 'epoch' and 'config' entries in the /var/lib/sheepdog/disc{0,1} directories.
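For what it's worth, my reading of that check boils down to something like the following. This is only a Python sketch of the idea, not sheepdog's actual C code: the function name is made up, and I am assuming that both entries have to be present for a directory to be treated as a meta-store.

```python
import os

def looks_like_meta_store(path):
    """Return True if `path` contains both an 'epoch' and a 'config' entry.

    Hypothetical helper: it mirrors my understanding of the check, under the
    assumption that both entries must exist for the directory to be rejected.
    """
    entries = set(os.listdir(path))
    return {"epoch", "config"} <= entries

if __name__ == "__main__":
    # Paths taken from my setup; adjust as needed.
    for disc in ("/var/lib/sheepdog/disc0", "/var/lib/sheepdog/disc1"):
        if os.path.isdir(disc):
            print(disc, "looks like a meta-store:", looks_like_meta_store(disc))
```

It only reads the directory listing and changes nothing on disk.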
An ls -l of /var/lib/sheepdog/disc0 shows (disc1 is similar):
-rw-r----- 1 root root 40 May 7 12:17 config
drwxr-x--- 2 root root 4096 Aug 18 19:00 epoch
drwxr-x--- 2 root root 6 May 7 12:17 journal
-rw-r----- 1 root root 0 May 7 12:15 lock
drwxr-x--- 4 root root 557056 Aug 18 19:13 obj
-rw-r--r-- 1 root root 4770639 Aug 18 19:15 sheep.log
srwxr-xr-x 1 root root 0 May 12 22:27 sock
-rw-r--r-- 1 root root 0 May 7 12:14 startup
These files were created by sheepdog itself, so it seems to have gotten mixed up in its own mess.
Now, what should I do? Should I remove everything but the obj directory and try again? What can I do to safely recover this cluster?
Thanks,