LizardFS anyone?

I have been using LizardFS for more than 10 years now. I have never lost any data because of failed hard drives, and I've had my fair share of hard drive failures over those 10 years. I run one chunkserver per disk, each in an LXC container, with the master in a container as well, all on the same server. Essentially, I use LizardFS as software RAID running on a single machine (an on-the-fly, per-folder configurable RAID... it's pretty neat).
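For illustration, the per-folder "RAID level" is just a goal you set with the lizardfs CLI. A rough sketch follows; the goal names, IDs and the mfsgoals.cfg path are from memory and may differ on your install:

    # goals are defined once in the master's goal file (e.g. /etc/mfs/mfsgoals.cfg)
    #  id  name    : definition
    3  3copies : _ _ _         # three plain copies on any chunkservers
    11 ec21    : $ec(2,1)      # erasure coding, 2 data parts + 1 parity part

    # then assigned per folder (recursively) on a mounted client
    lizardfs setgoal -r 3copies /mnt/lizardfs/documents
    lizardfs setgoal -r ec21    /mnt/lizardfs/scratch
    lizardfs getgoal /mnt/lizardfs/documents   # check what a path currently uses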

Having said that, there are quirks you have to be aware of:

1. The project has been stale for a few years now. It's not a big deal, but there are bugs that won't be fixed: https://github.com/leil-io/saunafs/issues/7

2. After 10 years, my LizardFS master server uses almost 5 GB of RAM (the amount is directly tied to the number of files stored in LizardFS).

3. I use EC2_1, EC6_2 (you need at least 8 chunkservers for the latter, but you get 2 disks of redundancy while using only about 1.33x the size of the data) and plain replication goals from 2 to 6 copies. Any essential data I keep at 4-6 copies, which means the data is replicated to 4 to 6 disks. I have lost files with EC2_1 before, but never with EC6_2. From my experience, you need a minimum of 2 redundancies (3 disks) with LizardFS/MooseFS, or you risk losing data.

4. The BIGGEST problem with both LizardFS and MooseFS is the amount of time it takes to replicate the needed copies when files are undergoal and, most importantly, the extreme amount of time it takes to DELETE chunks that are no longer needed. The only way to free up space in a reasonable time is to change the master config so it uses much more resources to do things faster; only then does replication/deletion really get going (see the config sketch after this list).

5. The most important undocumented problem, one that can cause loss of data: I just found out last week that the master server seems to "forget" to save the metadata file every hour. In 10 years that was never a problem, but last week the master crashed after I had added a few extra terabytes of data to the filesystem. In the morning, I noticed that all my files created since March 11, 2025 had disappeared. Then I noticed the last metadata file was from March 11!
So, essentially, because it didn't save the metadata, when it restarted it could only load the metadata from a month ago, so all the changes in the filesystem since then were lost.
The biggest issue with this is that all the chunk files created to hold the data of the new files from the last month are still on disk, and there's no way to find those orphans and delete them to free up space.
I am now waiting to see if the master will figure out the orphan chunks by itself and delete them, but since chunk deletion is slow, it will take some time to free up that space.
To correct the "forgetfulness" problem, I set up an hourly cron job that runs "lizardfs-admin save-metadata", which forces the master server to save the metadata file (see the crontab sketch after this list). So this problem should never happen again.

So, after 10 years, I'm considering migrating from LizardFS to SaunaFS, which is a fork of LizardFS under active development. There's even compatibility between chunkservers from Lizard and Sauna, so in theory it's possible to start by migrating just the master server...

LizardFS has worked pretty well for me for 10 years... It did give me a few headaches (like last week), but it does the job well, and I have 2 disks of redundancy on about 32 TB of data that only uses 47 TB of disk space.

My 2 cents... hope this helps someone who is still thinking about LizardFS!

If you are, I would recommend trying SaunaFS instead, since it's under active development... though I'm not sure yet how trustworthy the code is... LizardFS has been good to me for the past 10 years!
 
Updating in this thread, as @hradec already found the solution to their issue here :) https://github.com/leil-io/saunafs/issues/349
In summary, it turned out to be a problem with the system clock.

I'll try SaunaFS; it's recommended by one of the LizardFS team members, as already mentioned. This is the original source: https://github.com/lizardfs/lizardfs/issues/805#issuecomment-2238866486
 
Urmas Rist from Leil Storage (https://leil.io) has ported SaunaFS to Debian, and I've installed SaunaFS on my Proxmox cluster using deb packages that I built under Debian 13 (trixie) from the sources Urmas uploaded for a Debian ITP (Intent To Package) at https://salsa.debian.org/hpc-team/saunafs. I'm using it as directory storage for backups and for filesystem pass-through using "virtiofs". Leil Storage only supports SaunaFS on Ubuntu at present, but they are interested in the experiments we have been doing to evaluate SaunaFS as the SDS shared filesystem for Proxmox 9 running under a Debian 13 (trixie) host OS.
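For anyone who wants to reproduce the package build, it was roughly the standard Debian source-package workflow. The clone URL is derived from the salsa project above and the exact package names produced will vary, so treat this as a sketch:

    # build SaunaFS .deb packages from the ITP sources on Debian 13 (trixie)
    git clone https://salsa.debian.org/hpc-team/saunafs.git
    cd saunafs
    sudo apt build-dep ./             # install build dependencies from debian/control
    dpkg-buildpackage -us -uc -b      # build unsigned binary packages into ../
    sudo apt install ../saunafs*.deb  # install whatever binary packages were produced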
 
Hi all,

Leil Storage's CEO here (the company behind SaunaFS). I would like to clarify a few things for you, as people who had an interest in LFS in the past and may want to know more about the direction the project has taken since then.

We hired all of the former LFS team in early 2023, when the project finally went bankrupt, including the LFS creator. The idea was to use the code as a base, fix birth defects and technical debt, and extend the FS to support technological innovations and a scale unachievable earlier.

Now, almost 3 years and thousands of merged PRs later, we have a very different, much more performant, much more stable product that still offers the best that LFS had to offer back in the day: simplicity and ease of use. There is no intention to maintain compatibility for long, as we treat the project not as an "extension" of LFS but rather as moving it much further, to be a modern file system that is still built on the great pillars of the Google File System design, like many file systems created years and even decades ago. These pillars are still solid and strong, but they need to be re-architected in certain ways, and this is what we are doing. So we are definitely a "relative", but more like a great-grandson of the Google File System, as are many systems out there, including one of the best-known ones and the most closely related to LFS: MooseFS.

My question to you, as people who apparently were deeply familiar with the system and with technology in general: what would you outline as critical to have, must-have, and nice to have in functionality/features/tools for a distributed file system that would look close to ideal for you (assuming the love for LFS was real)? Even if your views date from years past, it will be very valuable for us to know.

There is no sales motive between the lines. I am only looking for ways to collect the opinions and feedback of fans from the past (and maybe today), and then see if there is something we can add to the product, be it the Proxmox use case or something else. Your opinion will matter a lot, and I would appreciate it if you could share it here.

Thank you.
 
what would you outline as critical to have

I am not a multi-filesystem expert but more of a user/administrator. My very limited but honest opinion is this: reliability, reliability, redundancy, performance, snapshots, compression, encryption. In that order.

Yes, reliability first and second. Compare it with (local) ZFS - it will never deliver damaged data. At least that's the plan ;-)

Probably the elephant in the network room for distributed storage is Ceph, right? What might a comparison with SaunaFS look like, feature-wise?

Good luck! :-)
 
I'm running SaunaFS on a 3-node PVE cluster plus a PBS server, with 4x 8 TB Seagate Barracuda DM-SMR SATA drives in each one. I tried Ceph and performance was terrible; reading around the issue, I now understand that Ceph works best with SSDs. I had a similar experience running Ceph on a Beowulf cluster with DM-SMR drives, which is why I switched to LizardFS. However, the LizardFS project ended badly, with an FTBFS error resulting in it being removed from Debian. SaunaFS is being developed as a replacement for LizardFS by a team that includes the original LizardFS developers. It performs well on a 4-node Beowulf cluster under Ubuntu 22.04, and on that basis I tried to port it to Debian myself. I asked the SaunaFS developers to submit a Debian ITP so that, eventually, SaunaFS will be available in the Debian repositories (at present Leil Storage only supports SaunaFS under Ubuntu 22.04 and 24.04). I think this is a great opportunity for Proxmox users to try out the SaunaFS SDS system under Debian 13 (trixie). Please post comments here if you do. I am also using ZFS with filesystem pass-through on a single-node Proxmox server, and it works well.
 
What is the use case for SaunaFS in Proxmox: image storage like CephFS, or do you run VMs on it? Since you are using SATA drives, did you run some fio tests or similar to compare the performance to Ceph?
 
My use case is shared storage for VMs using "virtiofs" filesystem pass-through, so I can run e.g. Slurm or have the same home directories on different VMs, as I did on Beowulf clusters. I want to use Proxmox for node provisioning, because it's more flexible and easier to administer remotely than running on bare metal. Ceph ran so slowly on my DM-SMR drives that I switched immediately to LizardFS and now SaunaFS. I've been running the fio-cdm benchmark to 'tune' SaunaFS on my COTS hardware, but write performance is terrible on DM-SMR drives if SaunaFS uses "fsync" to flush writes to disk immediately. Running without "fsync" allows the DM-SMR drives to catch up because writes are deferred. I've had about 30-60% of my 10G network bandwidth occupied and disks 60-80% busy during simultaneous Proxmox backups to 'Dir' storage from three PVE nodes. PBS is not a good use case for SDS, because my PBS server has to write to three other SaunaFS chunkservers as well as receive the backup stream from the Proxmox VM being backed up. Let's just say this is a work in progress, but it's interesting and I think it has a lot of potential for COTS Proxmox clusters. Leil Storage is working on much more advanced support for HM-SMR drives, but that's not applicable to the type of small-scale COTS servers used in scientific laboratories that interest me.
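For reference, the fsync behaviour I mean is a chunkserver-side setting. In LizardFS it was HDD_FSYNC_BEFORE_CLOSE in mfschunkserver.cfg; I'm assuming SaunaFS kept an equivalent option in its chunkserver config, so verify the exact name against your version's documentation before relying on it:

    # chunkserver config (mfschunkserver.cfg in LizardFS; SaunaFS equivalent assumed)
    # 0 = don't fsync every chunk on close -> DM-SMR drives can defer/reorder writes
    # 1 = fsync on close -> safer on power loss, but painfully slow on DM-SMR
    HDD_FSYNC_BEFORE_CLOSE = 0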
 
My question to you, as people who apparently were deeply familiar with the system and with technology in general: what would you outline as critical to have, must-have, and nice to have in functionality/features/tools for a distributed file system that would look close to ideal for you (assuming the love for LFS was real)? Even if your views date from years past, it will be very valuable for us to know.
Laughs. Easy question ;)

So... as I am fairly certain you already understand, this can't have a single answer, because the variables differ with the application/use case. If you want to make a file system that does it all, you're going to try to out-Ceph Ceph, and that's likely a losing proposition, if for no other reason than that they have such a long lead. Instead, I imagine that unless you have infinite developer resources, you probably want to identify a vertical to attack; one where you may already have some in-house expertise and sales resources.

Without that, I can just describe a few features that don't seem to exist in this space and could be killer features, in no specific order:
1. Multithreaded OSD. NVMe drives have redefined the capability of the lowest storage building blocks, to the point where all the old ways have effectively become their own bottleneck. Crimson is still 2-3 years away from being on any production cluster I'd be deploying; there is an opportunity there.
2. Dynamically adaptive placement groups. In all current implementations I'm aware of, a pool has to be defined with a rigid one-rule-for-the-whole-pool policy. The ability to specify rules by block size, initiator groups, or other system- or user-supplied logic could be a game-changer: 4k sync writes go to a rep3 pool on NVMe, 128k writes go on an 8k2n EC stripe, etc. ZFS is taking some positive steps here with the special device, but taken to its logical conclusion this could be a killer feature.
3. WORKING cache layers.
4. Be faster than Lustre on silos of RAID.
5. A clustered file system (CFS) for a shared LUN (this is one of PVE's weakest points; not really the target of what you're doing, but there's some overlap...)