(Developer) question about .chunk folder

But as mentioned ... you learn Big O notation (https://en.wikipedia.org/wiki/Big_O_notation) in semester 1 or 2 of computer science.
You learned that if a result is 0.12 * 10^2, hardware changes the coefficient ... so it can become 1.24 * 10^2 or 0.03 * 10^2.
The exponent of the 10^2 changes very, very little. A factor of 10, 100, or 1000 is very hard to achieve with hardware.
Usually a professor teaches the NP problem. NP problems can't be made tractable even with the "fastest" hardware.

I am aware and yes, that is correct - but Big O notation, as I mentioned in my previous reply, does not give you anything meaningful in our case.

To quote from the same article you linked:
"Big O notation characterizes functions according to their growth rates: different functions with the same asymptotic growth rate may be represented using the same O notation."

So yes, while hardware "just changes the coefficient" and is thus not "important" in terms of classifying the order of whatever algorithm that is being used, that "coefficient" nevertheless has a very meaningful impact for real world applications.

Moreover, you cannot just categorize filesystems by Big O (it's certainly possible, but it is damn hard) - the algorithms SMB or NFS use could very well be of the same order as those ext4 uses. Those filesystems are slow because they usually have to wait on other things to happen first. NFS usually has to wait until all writes have been committed to disk, for example. As far as I know, the NFS server still runs with sync by default (correct me if I'm wrong), so all writes are synchronous rather than asynchronous. This makes a HUGE difference if you're on HDDs. Conversely, if you use something like a SLOG on ZFS, you can actually improve the speed of NFS significantly.
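To get a feel for the cost of synchronous writes, here is a minimal local sketch (not an NFS benchmark - the file count, size, and names are arbitrary choices for illustration) comparing buffered writes against writes that are fsynced one by one, the way a sync export behaves:

```python
import os
import tempfile
import time


def write_files(directory, n, sync):
    """Write n small files; optionally fsync each one, like a sync export would."""
    start = time.perf_counter()
    for i in range(n):
        path = os.path.join(directory, f"chunk_{i}")
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
        os.write(fd, b"x" * 4096)
        if sync:
            os.fsync(fd)  # force the data to stable storage before continuing
        os.close(fd)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as d:
    t_async = write_files(d, 200, sync=False)
with tempfile.TemporaryDirectory() as d:
    t_sync = write_files(d, 200, sync=True)

print(f"buffered: {t_async:.3f}s  fsync-per-file: {t_sync:.3f}s")
```

On an HDD the fsync-per-file variant is typically dramatically slower; on fast NVMe or tmpfs the gap shrinks - which is exactly why a fast SLOG device helps sync NFS on ZFS.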

If you really want to see how fast NFS performs, have that test machine of yours act as the server instead of creating a localhost-to-localhost connection. Use a decently fast client machine so you can rule out slowdowns caused by the client, and a decently fast, low-latency connection so you can rule out bandwidth and latency issues. If you want, test the differences between client sync/async and server sync/async as well (four additional tests). This separation will also let you rule out interference from your test runs or from whatever else is going on on your host system. Removing the coefficients, as you say.

So I agree that your overall conclusion is correct - NFS and SMB should be avoided if one can afford to choose something else - but while I appreciate your efforts, your application of the theory you're referencing is incorrect.
 
Moreover, you cannot just categorize filesystems by Big O (it's certainly possible, but it is damn hard) - the algorithms SMB or NFS use could very well be of the same order as those ext4 uses.

Simply not true :)

if ext4 is 100% = O(n) = 1 "the baseline"
and smb is 94910.08%, what is the O(n) notation?

(remember - smb is localhost on localhost, writing to the same hardware and the same ext4 filesystem!)

is it
  • a * n
  • n^2
  • log(n)
  • e^n
According to my understanding, this smells - for a given hardware - like something that is e^n

  • In very simple programmer terms -> don't use it. There is no hardware on earth that will bring this 95000% anywhere near the 100% baseline.


Edit:

  • It is left to you as an exercise to run this with n = 10,000 files or n = 1,000,000 files. I personally don't care about the factor a in e^(a*n) in the equation.
 
Last edited:
Despite what multiple people have already told you (chunk file creation itself being only a small part of why NFS/SMB/... perform worse, the thing you are benchmarking being a rather poor proxy for actual real-world performance, proper benchmarking requiring reproducible setups, and so on), there is one more thing I can't help noticing - I think you don't understand what big O notation means (and what it is and isn't useful for). What is the "n" in your example? And what are you even counting?
 
According to my understanding, this smells - for a given hardware - something that is e^n
You are trying to approximate the asymptotic growth in runtime of an algorithm relative to its input size with a single data point. Sorry, but this is simply impossible, mathematically.

To elaborate: with just one data point, both of these file systems could have linear runtime, just offset by a constant factor of 949.1008. SMB could run in 949.1008 * n while ext4 runs in n - but since we are talking about asymptotic growth here, constant factors are meaningless. Both SMB and ext4 would be in the same complexity class in this scenario: O(n).

Anyway, none of this really has any impact on the complexity class of the algorithms themselves. NFS, SMB, ext4 and so on will roughly (extremely roughly) take about the same time for any fstat(2) or read(2) call on average. This time will differ between them by some factor, yes, but that factor won't grow with the number of system calls we issue. Hence, this factor can be treated as a constant and ignored when doing complexity class analysis. However, in a real-life setup such a constant factor can make the difference between something taking 10 seconds or 10 hours.
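The distinction between a constant factor and a growth rate can be made concrete by counting abstract "units of work" instead of measuring time (a toy model, not real filesystem code - the factor 949 just mirrors the number from above). Two linear-cost routines that differ by a fixed factor stay in the same complexity class, because their ratio does not grow with n:

```python
def ops_fast(n):
    """Model of a 'fast' filesystem: 1 unit of work per chunk -> O(n)."""
    return n


def ops_slow(n):
    """Model of a 'slow' filesystem: ~949 units per chunk -> still O(n)."""
    return 949 * n


# The ratio is identical for every n: a constant factor, not faster growth.
ratios = [ops_slow(n) / ops_fast(n) for n in (10, 1_000, 100_000)]
print(ratios)  # [949.0, 949.0, 949.0]
```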

Tl;dr: You can't do any kind of asymptotic growth analysis with just one data point per file system. You'd need to test each file system by e.g. creating 1, 10, 100, 1000, 10000, 100000, 1000000 chunks to make any such observation empirically. Meaning, you need to observe the growth in runtime across different values of n, where n is the number of chunks written, or something like that.
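A sketch of what such an empirical analysis could look like: estimate the growth exponent as the least-squares slope of log(time) versus log(n). The data below is synthetic, purely to demonstrate the method - a slope near 1 indicates linear growth, a slope near 2 quadratic:

```python
import math


def loglog_slope(ns, times):
    """Least-squares slope of log(time) vs log(n); approximates the growth exponent."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(t) for t in times]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


ns = [10, 100, 1_000, 10_000, 100_000]
linear_times = [5.0 * n for n in ns]           # pretend O(n) measurements
quadratic_times = [0.001 * n * n for n in ns]  # pretend O(n^2) measurements

print(loglog_slope(ns, linear_times))     # ~1.0
print(loglog_slope(ns, quadratic_times))  # ~2.0
```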

Problem C: The chunk exists in the index, but not on the disk.
Problem C:
  • Very simple. Run the read(2) operation on the filesystem anyway! It will fail! We delete the id from our index.
  • However, the user has a problem! The file is gone - something (not PBS!) deleted it.
  • Something very bad happened beyond the scope of the index and PBS anyway.
Also, regarding this, since I've been wondering about it for a while: How would you know you are encountering this problem if you only consult the cache? The cache here is wrong, so you'd need to consult the disk. Since you can't know whether you are in this scenario by only checking the cache, you will always need to check the disk anyway. Your cache becomes obsolete, as each check in the cache suddenly needs to be followed up by a check on the disk as well - which just means you are now checking both instead of just the disk. I don't see how that is an improvement.
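That argument can be sketched in code (hypothetical helper and variable names - this is not PBS's actual implementation): a positive hit in the index still has to be verified on disk, so the disk check cannot be skipped:

```python
import os
import tempfile


def chunk_exists(index: set, chunk_dir: str, chunk_id: str) -> bool:
    """Check a chunk via an in-memory index that may be stale (Problem C)."""
    if chunk_id not in index:
        return False                      # miss: trust the index for negatives
    path = os.path.join(chunk_dir, chunk_id)
    if not os.path.exists(path):          # indexed, but gone from disk
        index.discard(chunk_id)           # heal the index...
        return False                      # ...but only the disk could tell us
    return True                           # hit - still needed a disk check


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "aaa"), "w").close()  # chunk that really exists
    index = {"aaa", "bbb"}                     # "bbb" is stale: never written
    hit = chunk_exists(index, d, "aaa")        # True, after touching the disk
    stale = chunk_exists(index, d, "bbb")      # False; "bbb" is dropped

print(hit, stale, index)
```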
 
if ext4 is 100% = O(n) = 1 "the baseline"
No. This is not how Big O works. This is not how any of this works. You cannot just guess the order of a filesystem.

ext4 uses a hashed B-tree under the hood, which has a time complexity of O(log n). If you change the filesystem, the time complexity of specific operations might change slightly at most, because some filesystems use slightly different tree-based data structures under the hood, e.g. a B+-tree instead of a B-tree. Even then, factors like spatial locality (for the purposes of CPU caching) will play a role.
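For intuition about why tree-based lookups are O(log n), here is a toy binary search with a comparison counter standing in for a B-tree traversal (this is not ext4's actual HTree code, just an illustration of logarithmic lookup cost):

```python
import math


def search_steps(sorted_items, target):
    """Binary search counting comparisons; the count grows like log2(n)."""
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return steps
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


# Growing n a thousandfold only adds about 10 comparison steps.
for n in (1_000, 1_000_000):
    items = list(range(n))
    worst = max(search_steps(items, probe) for probe in (0, n // 2, n - 1))
    print(n, "->", worst, "steps (bound:", math.ceil(math.log2(n)) + 1, ")")
```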

Even if you use NFS, the time complexity will remain the same or very similar - if it were worse, you would have noticed much earlier. What you're measuring is still an unknown coefficient, because your tests do not take into account the time it takes for things to be committed to your disk.

Because this thread has now derailed from its original purpose, and we can all agree that NFS/SMB are subpar performance-wise and that adding (another) caching layer is useless or even detrimental to the overall performance of PBS, I'll consider this matter dealt with. Thank you for your efforts in creating these tests, but please inform yourself about the theory of algorithmic complexity before trying to apply it haphazardly to prove a point we already agree on. There's nothing left to discuss here - and as you saw yourself, we're actively working on updating our documentation.
 
No. This is not how Big O works. This is not how any of this works. You cannot just guess the order of a filesystem.

I might be in Austria in a few months. I accept your invitation for a beer!

I reran the tests with n = 10, ... n = 1 million.

You find the saturation test in my results: https://github.com/egandro/pbs-storage-perf-test/blob/main/results.md

What these tests do - you can see here: https://github.com/egandro/pbs-storage-perf-test/blob/main/create_random_chunks.py

- Same computer
- Same disk
- Same memory

The ext4 tests run natively on the filesystem.
The smb tests run on the same ext4 filesystem, >but< with a Samba server on localhost and a Samba mount on the same localhost.

I totally accept and I totally see - if we throw "more" hardware at it and create "bigger" chunks, the numbers will change.

This specific test is "how well does ext4 perform in contrast to smb, with artificial operations similar to the operations PBS does".

So here are the facts :)

n: 10,000 -> 100,000 (Real-world scenario: over time your backups create more and more chunks)

ext4: 1.39 sec -> 6.99 sec
smb: 39 sec -> 341 sec (= 5.5 min)

Conclusion - it will get dramatically slower.

n: 100,000 -> 1,000,000 (I don't know if anybody on our planet has 1 million chunks in PBS!)

ext4: 78 sec
smb: 3558 sec (= 1 hour)

We write to the same disk, same hardware, same physical ext4 - so the 78 sec are contained within the 3558 sec as the time smb spends writing to disk.

So, when comparing ext4 with smb (same hardware, same number of operations ...):

What kind of function describes this best?

n: 1,000,000

f(78 sec) = 3558 sec

Linear?
a * 78 = 3558 -> a ≈ 45.6

Quadratic?
78^a = 3558 -> a = 1.876864898 (78^1.876864898 ≈ 3557.99)

Conclusion 1:

What complexity / function class best describes putting files (with an artificial method simulating PBS operations) on an ext4 vs. an SMB filesystem?

Answer: O(n): n²

Conclusion 2:

Comparing Samba to ext4 (on the same test hardware, same test machine, same RAM) indicates, for n = 100,000 -> 1,000,000, up to a factor of 45 worse performance.


(I want my beer cool!)



[attached image: 1718271205472.png]
 
n: 10,000 -> 100,000 (Real-world scenario: over time your backups create more and more chunks)
Over what time? An hour? A day? A month? A year? Your tests create them all at once.

n: 10,000 -> 100,000 (Real-world scenario: over time your backups create more and more chunks)

ext4: 1.39 sec -> 6.99 sec
smb: 39 sec -> 341 sec (= 5.5 min)

Conclusion - it will get dramatically slower.
If by "slower" you mean "it takes longer" because you create 100,000 chunks at once instead of 10,000, then yes.

Here's an analogy: It takes more time to fill a large container of water than a small one. This is what you're testing.

n: 100,000 -> 1,000,000 (I don't know if anybody on our planet has 1 million chunks in PBS!)
I think we have a server with about 1 million chunks lying around somewhere. That's about 4 TB. That's relatively common.

n: 100,000 -> 1,000,000 (I don't know if anybody on our planet has 1 million chunks in PBS!)

ext4: 78 sec
smb: 3558 sec (= 1 hour)
Again, it obviously takes longer because you're creating more chunks.

Here's another experiment:
Python:
sum_a = 0

for i in range(10):
    sum_a += i
Python:
sum_b = 0

for i in range(100):
    sum_b += i
Python:
sum_c = 0

for i in range(1000):
    sum_c += i
Python:
sum_d = 0

for i in range(10000):
    sum_d += i
Python:
sum_e = 0

for i in range(100000):
    sum_e += i
Python:
sum_f = 0

for i in range(1000000):
    sum_f += i

Does the time to execute these code snippets scale linearly or quadratically?

What kind of function describes this best?

n: 1,000,000

f(78 sec) = 3558 sec

Linear?
a * 78 = 3558 -> a ≈ 45.6

Quadratic?
78^a = 3558 -> a = 1.876864898 (78^1.876864898 ≈ 3557.99)
This is so wrong on so many levels that I don't even know where to begin. You cannot just come up with some arbitrary constant that "describes" your growth (which it doesn't do here anyway, by the way). I will not go into this any further, because it would take me too long to explain.

Conclusion 1:

What complexity / function class best describes putting files (with an artificial method simulating PBS operations) on an ext4 vs. an SMB filesystem?

Answer: O(n): n²
No. This is. Not. How. Big. O. Works. This is immeasurably wrong.

Conclusion 2:

Comparing Samba to ext4 (on the same test hardware, same test machine, same RAM) indicates, for n = 100,000 -> 1,000,000, up to a factor of 45 worse performance.
Yes. SMB performs worse than ext4. That doesn't have anything to do with Big O, growth rates or anything of the sort. It's simply slower. Because of reasons that have been stated before.
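For what it's worth, the numbers posted earlier in the thread support exactly this. Computing the growth exponent between successive measurements on a log-log scale (a rough sanity check using the posted timings) gives slopes close to 1 for both ext4 and SMB - both scale roughly linearly with the number of chunks, separated by a large constant factor:

```python
import math

ns = [10_000, 100_000, 1_000_000]
ext4_times = [1.39, 6.99, 78.0]   # seconds, from the posted test results
smb_times = [39.0, 341.0, 3558.0]


def slopes(ns, times):
    """Growth exponent between successive measurements on a log-log scale."""
    return [
        math.log(times[i + 1] / times[i]) / math.log(ns[i + 1] / ns[i])
        for i in range(len(ns) - 1)
    ]


print("ext4 slopes:", slopes(ns, ext4_times))  # both near 1 -> roughly linear
print("smb slopes:", slopes(ns, smb_times))    # both near 1 -> roughly linear
print("constant factor at n=1,000,000:", smb_times[-1] / ext4_times[-1])
```

So the measurements themselves, taken across three values of n, point to the same complexity class for both setups - the ~45x gap is the constant factor, not a different growth rate.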

Now, because you made the effort to provide some data and some graphs to back up your claims, allow me to ask you one more thing:

[attached image: Screenshot_20240613_131648-edited.png]

Do you think this shows a linear or a quadratic function? Do you think this is linear or quadratic growth? I'm not being sarcastic here, by the way. I want you to think deeply about this.

Now, apologies if I seem rude here, but it appears to me that you have a fundamental misunderstanding of mathematics. Yet you speak with absolute confidence that your claims and conclusions are correct. They're not, and there's no point in continuing this any further. I'm very sorry, but I highly suggest reading a mathematics textbook. I think we're done here, and no, nobody owes you a beer. I wish you a nice day.
 
Yes. SMB performs worse than ext4. That doesn't have anything to do with Big O, growth rates or anything of the sort. It's simply slower. Because of reasons that have been stated before.

Finally :)

I introduced the O(n) notation because of one single statement that one of you made.

So yes, while hardware "just changes the coefficient" and is thus not "important" in terms of classifying the order of whatever algorithm that is being used, that "coefficient" nevertheless has a very meaningful impact for real world applications.

So - changing the hardware - an Epyc CPU with 128 cores at 5.5 GHz, running all tests from a RAM disk.

The overall "performance gain" is 100% irrelevant. If you wait 55 min instead of 60 min - what do you gain? For "real-world applications", that is the definition of not acceptable.

Choosing the right filesystem is a 70-second thing (probably a 30-second thing on the Epyc).


That - only that - was the only reason for introducing the O(n) notation.

We finally have an agreement that >> only << Samba itself is the root issue.


Your colleagues dropped a statement: "hey, your tests are bad, because you didn't use a lot of hardware to test them". That is what triggered me! I worked on this for days, and that statement - "yeah, use more hardware for your tests" - is a red herring.


The tests showed - with artificial workloads - that the filesystem has a dramatic impact on PBS, to the level where it can stop working!

I totally agree that it can be even worse if you have:

- a bad network
- high latency
- slow disks
- little RAM

But an Epyc 128-core machine with fiber optics won't help you with SMB.

Now, apologies if I seem rude here, but it appears to me that you have a fundamental misunderstanding of mathematics. Yet you speak with absolute confidence that your claims and conclusions are correct. They're not, and there's no point in continuing this any further. I'm very sorry, but I highly suggest reading a mathematics textbook. I think we're done here, and no, nobody owes you a beer. I wish you a nice day.

Let's keep this professional. I proved the point that there is a "dramatic increase of waiting time"; if you can't name a function for "dramatic increase", that's OK with me.
 
