PBS Task history creating "bad" files at times

so it was just Nextcloud complaining about the backslash after all. this is unfortunate, but syncing PBS task logs to Windows(-compatible) systems is not really something we care about ;)
 
There is zero Windows anything involved. I don't know why you think there is.
 
NextCloud doesn't allow syncing of files with "\" in their name. the reason is that Windows cannot handle them ;)
 
That is irrelevant, there is _zero_ Windows involved. My desktop isn't Windows, it's Linux, same for the nextCloud server (Linux). I uploaded via the webGUI, which is my browser. Where do you think Windows comes into play at all here? Because there's zero Windows involved.
 
There doesn't have to be any Windows involved. What fabian meant was that if the Nextcloud developers code it to be compatible with Windows, it won't work with "\" no matter whether you are using Linux, Mac or whatever, as it otherwise wouldn't sync with people using Windows.
 
yes. Nextcloud doesn't allow backslashes in file names. it also doesn't allow regular slashes, for the same reason: both are used as path separators on one OS or another, and since they want to support clients across the board, they forbid both. if you want improved compat with Windows client systems, you can also forbid other characters that are a problem on Windows (such as ':'), but that is optional (don't ask me why, I didn't write Nextcloud ;)). you don't need to run Windows anywhere for "\" to be forbidden in Nextcloud; it always is.
 
to be honest, why is pbs using special chars like \ , : or ! at all in filenames ?

i have rarely seen such weird file naming, and it really makes users' lives hard when it comes to script processing. i'm sure quite a few people are pulling their hair out over this.

let's give an example

Code:
# find . |grep 000013C3|while read file;do echo $file;ls -la "$file";done

./UPID:pbs01:003DF94A:25FC9635:000013C3:6950546E:backup:dsx2dpvex2dcluster1x3avm-187:pbsbackup_pve-cluster1@pbs:

ls: cannot access './UPID:pbs01:003DF94A:25FC9635:000013C3:6950546E:backup:dsx2dpvex2dcluster1x3avm-187:pbsbackup_pve-cluster1@pbs:': No such file or directory
 
because it's a Linux based piece of software, and Linux doesn't care about such things (paths are sequences of bytes). shell handling does require extra attention, that's why tools like find and xargs have special modes:

Code:
$ find . -iname "*001D9C7A*" -print0 | xargs -0 ls -lha
-rw-r--r-- 1 backup backup 325 Sep  5 13:28 './7A/UPID:yuna:00083F6E:001D9C7A:00000000:68BAC93E:benchmark:tank\x3ahost-benchmark:test@pbs:'
-rw-r--r-- 1 backup backup 36K Sep  5 14:00  ./7A/UPID:yuna:00083F6E:001D9C7A:00000001:68BAD0C0:prunejob:tank:root@pam:

you will also notice that your shell (hopefully ;)) escapes the back slash when tab completing, so things like

Code:
$ ls -lha 7A/UPID:yuna:00083F6E:001D9C7A:0000000<tab>
UPID:yuna:00083F6E:001D9C7A:00000000:68BAC93E:benchmark:tank\\x3ahost-benchmark:test@pbs:  UPID:yuna:00083F6E:001D9C7A:00000001:68BAD0C0:prunejob:tank:root@pam:

will also work and do the right thing. also note how the "ls" output in the first snippet quotes the path, so that the backslash has no effect when copying.
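
for the `while read` loop from the earlier post, here is a minimal sketch of a variant that should keep the backslash intact. plain `read` treats `\` as an escape character and drops it, which is why the path echoed there no longer matched any file on disk. the function name `list_matching` is made up for illustration, and this newline-separated form still breaks on file names that contain a newline (the `-print0` / `xargs -0` mode above covers that case).

```shell
# Hypothetical helper: run `ls -la` on each file under directory $1
# whose path contains the fixed string $2.
list_matching() {
    find "$1" -type f | grep -F -- "$2" |
        while IFS= read -r file; do
            # IFS= keeps leading/trailing whitespace,
            # -r tells read to keep backslashes literal.
            # printf is used instead of echo, because some shells'
            # echo interprets backslash sequences.
            printf '%s\n' "$file"
            ls -la -- "$file"
        done
}
```

called as `list_matching . 000013C3`, this should print each matching path followed by its `ls -la` line instead of "No such file or directory".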

systemd has the same issue (and a similar solution) -> unit filenames cannot contain /, so it's encoded as -, which in turn means - needs to be escaped, which uses \x2d, which works just fine. (they do escape a few more things)
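
to make the `\xNN` style of escaping a bit more concrete, here is a rough sketch of a decoder in plain POSIX shell. it assumes the only escapes in the name are `\x` followed by exactly two hex digits (as in `tank\x3ahost-benchmark` above); the full escaping rules used by PBS or systemd are not spelled out in this thread, so treat this as an illustration, not a reimplementation. `decode_hex_escapes` is a made-up name.

```shell
# Hypothetical helper: replace every \xNN (two hex digits) in $1
# with the byte it encodes, and print the result.
decode_hex_escapes() {
    rest=$1
    out=
    while :; do
        case $rest in
            *'\x'??*)
                head=${rest%%'\x'*}        # text before the first \x
                tail=${rest#*'\x'}         # text after the first \x
                hex=${tail%"${tail#??}"}   # the two hex digits
                # hex -> 3-digit octal, then let printf emit that byte
                out=$out$head$(printf "\\$(printf '%03o' "0x$hex")")
                rest=${tail#??}
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

decode_hex_escapes 'tank\x3ahost-benchmark'   # prints: tank:host-benchmark
```

note the single quotes around the argument: without them the shell itself would swallow the backslash before the function ever sees it.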
 
In my interpretation of the question "why is PBS using these characters" (paraphrase) it's less that Linux "can" do this, and more about "what is the actual benefit of using these characters vs not using them at all when PBS makes/interacts with these files?". To me, I don't see an upside to these characters being "acceptable" for PBS to use at all, and while yes, there are aspects where it _does not_ cause problems, there are aspects where it does. And sure, there may be a reason for PBS to keep using these characters, but so far I don't see one.

I'd say it would be worthwhile to either:
a) actually identify the tangible benefits of keeping the characters (as in, something PBS users actually could/do care about)
or
b) have PBS stop using these characters at all for filenames

So...?
 
the benefits of a) are
- better (human) readability (compared to more involved escaping)
- no need for migrating between old and new formats
- compat with PVE/.. that use identical schemes (we share code across products)

the main downsides are
- if you try to store those files on systems that are not meant to store them, it might not be possible
- interactions in the shell might need additional care/quoting/.. for some of the characters (notably, this is not the case for ':', which is the only "exotic" character used in the datastore files)
 
Yeah I'm going to straight up disagree here.

1. If you think "UPID:redacted:000289CB:3BEF2DE3:000003B1:64B24022:backup:Redacted/x3act-182:root@pam:" is "better" human readability, you need to revisit how you think humans read text. That's an extremely long string of text and the problematic characters make it _HARDER_ to read, not easier/better in any way.
2. No need for migrating between old and new formats isn't a benefit, that's just laziness.
3. Compatibility with other PVE formats is a very shallow benefit that, frankly, benefits Proxmox developers far more than it does actual users of the system. Again, this is barely noteworthy as a benefit.

We've seen problems with the characters in this very thread. They create very real administrative problems, which is something you and other Proxmox developers should actually care about. Frankly, the administrative problems I've had to deal with because of these characters far outweigh the benefits you speak to here.

I think it would benefit the Proxmox ecosystem at large a lot more if, instead of just being a stick in the mud and pushing back so hard on this, you actually took the merit of the administrative problems seriously, realised it needs to be changed, and started figuring out something better than "we like it more, so it stays the same".

Like, I'm glad that you fleshed out the pros you speak to, but if that's the extent of what you can come up with, it seems self-evident to me that this _NEEDS_ to change rather than stay the same.
 