On 11/2/25 07:18, Alyssa Ross wrote:
> Demi Marie Obenour <demiobenour@gmail.com> writes:
> 
>> On 11/1/25 08:17, Alyssa Ross wrote:
>>> Demi Marie Obenour <demiobenour@gmail.com> writes:
>>>
>>>> On 10/29/25 08:01, Alyssa Ross wrote:
>>>>> Demi Marie Obenour <demiobenour@gmail.com> writes:
>>>>>
>>>>>> Spectrum OS's host has no network access.  Updates must be downloaded by
>>>>>> VMs.  The downloads are placed into a bind-mounted directory.  The VM
>>>>>> can write whatever it wants into that directory.  This includes symlinks
>>>>>> that subsequent code might open, which would create a path traversal
>>>>>> vulnerability.  It also includes paths with names containing containing
>>>>>> terminal escape sequences, newlines, or other nastiness.  Furthermore,
>>>>>> the directory should not have any subdirectories either.
>>>>>>
>>>>>> Add a simple C program that checks for such ugliness and indicates
>>>>>> (via its exit code) if the VM misbehaved.  It also ensures that both
>>>>>> SHA256SUMS and SHA256SUMS.gpg are present.
>>>>>>
>>>>>> Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
>>>>>> ---
>>>>>>  host/rootfs/Makefile                        |  6 +-
>>>>>>  lib/kcmdline-utils.mk                       |  6 ++
>>>>>>  tools/default.nix                           |  1 +
>>>>>>  tools/meson.build                           |  1 +
>>>>>>  tools/updates-dir-check/meson.build         |  4 ++
>>>>>>  tools/updates-dir-check/updates-dir-check.c | 94 +++++++++++++++++++++++++++++
>>>>>>  6 files changed, 110 insertions(+), 2 deletions(-)
>>>>>
>>>>> I still don't really understand why this needs to be a C program instead
>>>>> of find -H /path/to/dir -not -type f.  None of the other checks seem
>>>>> very necessary?
>>>>
>>>> I trust this code more than I trust (especially) the Busybox
>>>> implementation of find.
>>>
>>> This doesn't really make sense to me.  All of this is quite trivial find
>>> behaviour — not the sort of thing that's unlikely to have been widely
>>> tested.  No objection to GNU find though if it helps.
>>
>> I see: find with a -exec false to return an error if anything matching
>> is found?
>>
>> I'm way more familiar with C than with find, which is why I missed this.
> 
> Hmm, thinking about it some more I suppose there's a problem with find:
> there's no way to get it to exit as soon as it finds a matching file,
> with a failing error code, so it could end up running way too long.
> 
> So the C program is fine, I guess.
> 
>>>>> How are -Werror=pedantic and -DNDEBUG getting enabled in the first place?
>>>>
>>>> I believe Meson sets -DNDEBUG in some cases.
>>>
>>> Yes, if the user explicitly asks for it.
>>
>> I thought it was default for release builds.
> 
> Doesn't look like it:
> 
> https://github.com/mesonbuild/meson/blob/d00f840c573103c2d51aed2b169386f7acfe7026/mesonbuild/compilers/compilers.py#L255-L264
> 
> b_ndebug defaults to false.

Got it, thanks!

>>>>>> +		if (entry->d_name[0] == '.')
>>>>>> +			if (len == 1 || (len == 2 && entry->d_name[1] == '.'))
>>>>>> +				continue;
>>>>>> +		if (strcmp(entry->d_name, "SHA256SUMS") == 0) {
>>>>>> +			found_sha256sums = true;
>>>>>> +			continue;
>>>>>> +		}
>>>>>> +		if (strcmp(entry->d_name, "SHA256SUMS.gpg") == 0) {
>>>>>> +			found_sha256sums_gpg = true;
>>>>>> +			continue;
>>>>>> +		}
>>>>>> +		unsigned char c = (unsigned char)entry->d_name[0];
>>>>>> +		if (!((c >= 'A' && c <= 'Z') ||
>>>>>> +		      (c >= 'a' && c <= 'z')))
>>>>>> +			errx(1, "Filename must begin with an ASCII letter");
>>>>>> +		for (size_t i = 1; i < len; ++i) {
>>>>>> +			c = (unsigned char)entry->d_name[i];
>>>>>> +			if (!((c >= 'A' && c <= 'Z') ||
>>>>>> +			      (c >= 'a' && c <= 'z') ||
>>>>>> +			      (c >= '0' && c <= '9') ||
>>>>>> +			      (c == '_') ||
>>>>>> +			      (c == '-') ||
>>>>>> +			      (c == '.'))) {
>>>>>> +				if (c >= 0x20 && c <= 0x7E)
>>>>>> +					errx(1, "Forbidden subsequent character in filename: '%c'", (int)c);
>>>>>> +				else
>>>>>> +					errx(1, "Forbidden subsequent character in filename: byte %d", (int)c);
>>>>>> +			}
>>>>>> +		}
>>>>>
>>>>> Why do we care?  Surely we don't expect systemd-sysupdate to put
>>>>> filenames unescaped into a shell or something.
>>>>
>>>> Prevent escape sequence injection into terminals and logs is the
>>>> main reason.  Qubes OS has similar checks in some places, though they
>>>> are off by default for file copying.
>>>
>>> Doing this in a tool that's only used by sysupdate is a very ad-hoc way
>>> to protect against that.  I think if we want to protect against that
>>> sort of thing it should be done in one place, probably in virtiofsd.
>>
>> I think sysupdate is more likely to log unsanitized data, especially
>> as systemd-journald has no problems with it.
> 
> What's the difference between systemd-journald's behaviour and the
> logging we have?

I'm not familiar with s6 at all, but I think it is at least worth
investigating.  Also, all else equal it is best to reject invalid
untrusted input as early as possible.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)