Demi Marie Obenour <demiobenour@gmail.com> writes:

>>> -$(dest): ../../scripts/make-erofs.sh $(PACKAGES_FILE) $(addprefix image/,$(FILES)) $(BUILD_FILES) build/empty build/fifo
>>> +$(dest): ../../scripts/make-erofs.sh $(PACKAGES_FILE) $(addprefix image/,$(FILES)) $(BUILD_FILES) build/empty build/fifo file-list.mk
>> 
>> Given that we don't include Makefile as a dependency, it probably
>> doesn't make sense to depend on other included Makefile fragments
>> either?
>
> If the user regenerates the file lists, it is more convenient for
> them to not need to call 'make clean', and having the dependency is
> harmless.

I just feel a bit weird about it.  It would of course be nice if
changing the make rules didn't require a make clean, but to actually get
that behaviour we'd need to add a dependency on Makefile to every single
rule (or this functionality would have to be built into make, which
would make more sense…).  I worry that if you only /sometimes/ get to
skip make clean after changing the make rules, things become more
confusing, because then you have to understand when…

Open to being told I'm putting theory ahead of practicality here, though.

>>> diff --git a/scripts/genfiles.awk b/scripts/genfiles.awk
>>> new file mode 100755
>>> index 0000000000000000000000000000000000000000..62863e78f157f1d9a0f6dbdb0f4380db9c9d48cb
>>> --- /dev/null
>>> +++ b/scripts/genfiles.awk
>>> @@ -0,0 +1,115 @@
>>> +#!/usr/bin/env -S LC_ALL=C LANGUAGE=C awk -E
>>> +# SPDX-License-Identifier: EUPL-1.2+
>>> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour <demiobenour@gmail.com>
>>> +function check_status(status) {
>>> +	if (status < 0) {
>>> +		printf "FATAL: getline: %s\n", status > "/dev/stderr";
>>> +		exit 1;
>>> +	}
>>> +	return status;
>>> +}
>>> +
>>> +function check_close(value,    status) {
>>> +	status = check_status(close(value));
>>> +	if (status != 0) {
>>> +		printf "FATAL: command exited with status %d\n", status > "/dev/stderr";
>>> +		exit status;
>>> +	}
>>> +}
>>> +
>>> +function shell_quote(command) {
>>> +	gsub(/'/, "'\\\\&'", command);
>>> +	return ("'" command "'");
>>> +}
>>> +
>>> +function get(command,          line, path, array_index, inode_type, mode, modes, symlink_count, symlinks, file_count, files, rc_count, rc_files, is_license, is_rc) {
>>> +	file_count = 0;
>>> +	symlink_count = 0;
>>> +	rc_count = 0;
>>> +	modes["120000"] = "symlink";
>>> +	modes["040644"] = "directory";
>>> +	modes["040755"] = "directory";
>>> +	modes["100644"] = "regular";
>>> +	modes["100755"] = "regular";
>>> +	print "# SPDX-License-Identifier: CC0-1.0";
>>> +	print "# SPDX-FileCopyrightText: Not Copyrightable (machine-written)";
>>> +	print "# Generated by scripts/genfiles.awk, DO NOT EDIT!";
>>> +	while (check_status(command | getline line)) {
>>> +		if (line !~ /^[0-7]{6}\t/) {
>>> +			# this is a git bug
>>> +			print "FATAL: git ls-files output didn't start with a valid mode" > "/dev/stderr";
>>> +			exit 1;
>>> +		}
>>> +		path = substr(line, 8);
>>> +		if (path !~ /^[ -~]+$/) {
>>> +			# also a git bug
>>> +			print "FATAL: git ls-files didn't quote properly" > "/dev/stderr";
>>> +			exit 1;
>>> +		}
>>> +		if (path ~ /^\/|((^|\/)\.{0,2}($|\/))/) {
>>> +			# also a git bug
>>> +			printf "FATAL: git ls-files output non-canonical path '%s'\n", path > "/dev/stderr";
>>> +			exit 1;
>>> +		}
>>> +		if (path !~ /^[[:alnum:]_.+@/-]+$/) {
>>> +			printf "FATAL: filename '%s' has forbidden characters\n", path > "/dev/stderr";
>>> +			exit 1;
>>> +		}
>> 
>> I feel like this could be a lot nicer if we ran git ls-files outside
>> awk, and could then use its nice top-level matching syntax?
>
> That would work, but would need a wrapper script.

That's fine.  Make a scripts/genfiles.sh that runs scripts/genfiles.awk.
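
To illustrate the shape I have in mind, here is a rough sketch, not a
drop-in replacement.  The function name check_listing and the fatal()
helper are hypothetical, and it assumes the listing arrives on stdin as
"mode<TAB>path" lines, as the patch's checks imply.  With the listing
produced outside awk, each validation becomes a top-level pattern-action
rule instead of an if inside a getline loop:

```shell
#!/bin/sh
# Hypothetical sketch of scripts/genfiles.sh.  The listing is read from
# stdin, so awk's top-level pattern-action rules can do the validation.
check_listing() {
	LC_ALL=C awk '
	# Hypothetical helper: report the problem and stop with status 1.
	function fatal(msg) {
		print "FATAL: " msg > "/dev/stderr"
		exit 1
	}
	# Every line must start with a six-digit octal mode and a tab.
	!/^[0-7][0-7][0-7][0-7][0-7][0-7]\t/ { fatal("line did not start with a valid mode") }
	# The path is everything after the mode and the tab.
	{ path = substr($0, 8) }
	# Only printable ASCII is expected in a properly quoted path.
	path !~ /^[ -~]+$/ { fatal("path was not quoted properly") }
	# Reject absolute paths and empty, "." or ".." components.
	path ~ /^\/|((^|\/)\.?\.?($|\/))/ { fatal("non-canonical path: " path) }
	# Allow-list of filename characters (alphanumerics spelled out for LC_ALL=C).
	path !~ /^[A-Za-z0-9_.+@\/-]+$/ { fatal("forbidden characters in: " path) }
	# In this sketch, accepted paths are simply echoed.
	{ print path }
	'
}
```

The real wrapper would pipe the output of whatever git ls-files
invocation the patch already uses into this, and scripts/genfiles.awk
would keep the file-list generation logic in place of the final print.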