Demi Marie Obenour writes:

>>> diff --git a/scripts/genfiles.awk b/scripts/genfiles.awk
>>> new file mode 100644
>>> index 0000000000000000000000000000000000000000..6fe327fd0a314d226dbce23854aa8f119e9c8f34
>>> --- /dev/null
>>> +++ b/scripts/genfiles.awk
>>> @@ -0,0 +1,120 @@
>>> +#!/usr/bin/env -S LC_ALL=C LANGUAGE=C awk -E
>>> +# SPDX-License-Identifier: EUPL-1.2+
>>> +# SPDX-FileCopyrightText: 2025 Demi Marie Obenour
>>> +BEGIN {
>>> +	RS = "\n";
>>> +	FS = "\t";
>>> +	file_count = 0;
>>> +	symlink_count = 0;
>>> +	rc_count = 0;
>>> +	is_rc = 0;
>>> +	exit_code = 0;
>>> +	done = 0;
>>
>> awk variables are implicitly initialized to 0 when you try to do
>> arithmetic on an undefined variable, so no need for these.
>
> GNU Awk can lint against that. I used its lint mode because it also
> warns against non-portable constructs. Also, an undefined awk
> variable used as an array subscript is treated as the empty string,
> not 0, which could lead to confusion.

Okay, happy to leave them if you want.

>>> +
>>> +filename ~ /^image\/etc\/s6-rc\// {
>>> +	if (mode != "regular") {
>>> +		fail("s6-rc-compile input '" filename "' isn't a regular file");
>>> +	}
>>> +	rc_count += 1;
>>> +	rc_files[rc_count] = filename;
>>
>> rc_files[rc_count++]
>>
>> (will make it 0-indexed though so update the loops too)
>
> I think this might break without explicit variable initialization.

It does not.

>>> +		"# Generated by scripts/genfile.sh. Any changes will be overwritten.\n" \
>>> +		"FILES ::=") > out_file;
>>
>> I note the change to ::=. Do you think we should do that across the
>> board in our Makefiles?
>
> POSIX specifies ::= and it has better semantics in most cases, but I don't
> know if the BSD makes implement it. ::= causes the RHS to be expanded immediately,
> so subsequent changes in variables referenced by it do not affect the LHS.

Happy to change, but would prefer we did it all at once.
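
To back up the "It does not" above, here is a minimal sketch (not part of the patch) of post-increment on a variable that was never assigned. The helper name collect_indices and the array name seen are made up for illustration. The value of n++ is numeric, so the first subscript is 0 rather than the empty string:

```shell
# Sketch only: collect_indices is a hypothetical helper, not from the patch.
collect_indices() {
	printf 'a\nb\nc\n' | awk '
		# n is never initialised; n++ evaluates to the number 0 first.
		{ seen[n++] = $0 }
		END {
			# Loop is 0-indexed to match the post-increment above.
			for (i = 0; i < n; i++)
				printf "%d=%s\n", i, seen[i]
		}'
}

collect_indices
```

gawk's lint mode may still complain about the uninitialised use, which is a separate question from whether it works.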

>>> +	for (array_index = 1; array_index <= file_count; array_index += 1) {
>>> +		printf " \\\n\t%s", files[array_index] > out_file;
>>> +	}
>>> +	printf ("\n\n" \
>>> +"# These are separate because they need to be included, but putting\n" \
>>> +"# them as make dependencies would confuse make.\n" \
>>> +"LINKS ::=") > out_file;
>>> +	for (array_index = 1; array_index <= symlink_count; array_index += 1) {
>>> +		printf " \\\n\t%s", symlinks[array_index] > out_file;
>>> +	}
>>> +	printf "\n\nS6_RC_FILES ::=" > out_file;
>>> +	for (array_index = 1; array_index <= rc_count; array_index += 1) {
>>> +		printf " \\\n\t%s", rc_files[array_index] > out_file;
>>> +	}
>>> +	printf "\n" > out_file;
>>> +	if (close(out_file)) {
>>> +		print ("Cannot close output file: " ERRNO "\n") > "/dev/stderr";
>>> +		exit 1;
>>> +	}
>>> +}
>>> diff --git a/scripts/genfiles.sh b/scripts/genfiles.sh
>>> new file mode 100755
>>> index 0000000000000000000000000000000000000000..77a8d95e88b6851be9447698556efe4f1eab174b
>>> --- /dev/null
>>> +++ b/scripts/genfiles.sh
>>> @@ -0,0 +1,29 @@
>>> +#!/usr/bin/env -S LC_ALL=C LANGUAGE=C bash --
>>
>> env -S is not portable, and I don't think anything here needs bash
>> specifically.
>
> $'\t' doesn't work with all shells, though I believe it is either
> part of the current POSIX standard or will be added. I'll use
> /usr/bin/env bash, which breaks if the script is renamed to something
> starting with '-'.

It's in POSIX.1-2024. I'd prefer this was /bin/sh like our other
scripts; we already have shellcheck checking for non-portable constructs.

>>> +case $0 in
>>> +(/*) cd "${0%/*}/..";;
>>> +(*/*) cd "./${0%/*}/..";;
>>> +(*) cd ..;;
>>> +esac
>>
>> Perhaps we could use git rev-parse --show-toplevel?
>
> git ls-files doesn't have that option.

I mean we could cd "$(git rev-parse --show-toplevel)", and then be in a
consistent starting place.
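
Going back to the $'\t' point: if we want this to run under /bin/sh on shells that predate the 2024 standard, one option is to build the tab with printf. This is only a sketch. The variable name tab and the helper sort_by_path are my own, and the sample records stand in for the git ls-files output:

```shell
# Sketch only: a POSIX sh spelling of the tab the patch writes as $'\t'.
# Command substitution strips trailing newlines, not tabs, so this is safe.
tab=$(printf '\t')

# sort_by_path is a hypothetical helper: sort "mode<TAB>path" records by path.
sort_by_path() {
	LC_ALL=C sort -t "$tab" -k 2
}

# Stand-in input; the real script would pipe git ls-files output here,
# using "--format=%(objectmode)${tab}%(path)" for the format argument.
printf '100644\tb/file\n120000\ta/link\n' | sort_by_path
```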

>>> +for i in host/rootfs img/app vm/sys/net; do
>>> +	output_file=$i/file-list.mk
>>> +	{
>>> +		git -C "$i" -c core.quotePath=true ls-files $'--format=%(objectmode)\t%(path)' -- image |
>>> +			sort -t $'\t' -k 2
>>
>> TIL sort -t and -k! 🤯
>>
>>> +		echo DONE
>>
>> Why do we need this?
>
> To avoid producing any output file if the input is truncated.
>
>>> +	} |
>>> +		gawk -v "out_file=$output_file.tmp" -E scripts/genfiles.awk
>>
>> Why not stdout?
>
> The output file is created by awk so that it is only created if
> nothing went wrong.

For both of these, we already have exit status to communicate if
something goes wrong. When would output get truncated without that
being an unsuccessful exit?

>>> +	if [ -f "$output_file" ]; then
>>> +		# Avoid changing output file if it is up to date, as that
>>> +		# would cause unnecessary rebuilds.
>>> +		if cmp -s -- "$output_file.tmp" "$output_file"; then
>>> +			rm -- "$output_file.tmp"
>>> +			continue
>>> +		else
>>> +			astatus=$?
>>> +			if [ "$astatus" != 1 ]; then exit "$astatus"; fi
>>
>> Could avoid the need for the variable and multiple ifs. Up to you
>> whether you prefer it:
>>
>> 	set +e
>> 	cmp -s -- "$output_file.tmp" "$output_file"
>> 	set -e
>> 	case $? in
>> 	0)
>> 		rm -- "$output_file.tmp"
>> 		continue
>> 		;;
>> 	1)
>> 		;;
>> 	*)
>> 		exit $?
>> 		;;
>> 	esac
>
> This might set $? to the return value of 'set -e' (0). Whether or
> not it actually does is at least not obvious from reading the code.

Oh good point. Let's stick with your way then.
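
For the record, a small sketch of why the set +e / set -e version loses the status. Both function names are made up for illustration, and false stands in for a cmp that reports a difference. Each body runs in a subshell so the option changes do not leak into the caller:

```shell
# Sketch only: both helpers are hypothetical, with `false` standing in for
# a cmp that finds the files differ.

# $? here is the status of `set -e` itself, so the failure is lost.
status_after_set_e() (
	set +e
	false
	set -e
	echo "$?"
)

# Capturing the status in the same list as the command keeps it, and the
# `||` also stops errexit from aborting on the failure.
status_captured() (
	set -e
	status=0
	false || status=$?
	echo "$status"
)

status_after_set_e   # prints 0
status_captured      # prints 1
```

So the astatus=$? form in the patch is the one that keeps the distinction between "files differ" and "cmp failed".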