On 11/13/25 13:01, Alyssa Ross wrote: > Demi Marie Obenour writes: > >> On 11/13/25 08:21, Alyssa Ross wrote: >>> Demi Marie Obenour writes: >>> >>>> +static void checkdir(int fd) >>>> +{ >>>> + DIR *d = fdopendir(fd); >>>> + if (d == NULL) >>>> + err(EXIT_FAILURE, "fdopendir"); >>>> + // If there is an I/O error while there are dirty pages outstanding, >>>> + // the dirty pages are silently discarded. This means that the contents >>>> + // of the filesystem can change behind userspace's back. Flush all >>>> + // dirty pages in the filesystem with the directory to prevent this. >>>> + if (syncfs(fd) != 0) >>>> + err(EXIT_FAILURE, "syncfs"); >>>> + for (;;) { >>>> + errno = 0; >>>> + struct dirent *entry = readdir(d); >>>> + if (entry == NULL) { >>>> + if (errno) >>>> + err(EXIT_FAILURE, "readdir"); >>>> + break; >>>> + } >>>> + assert(entry->d_reclen > offsetof(struct dirent, d_name)); >>> >>> Would be a POSIX violation for d_name not to be valid I think. >> >> Indeed it would be, but I preferred to check that explicitly. >> >>>> + size_t len = strnlen(entry->d_name, entry->d_reclen - offsetof(struct dirent, d_name)); >>> >>> POSIX also guarantees a terminating null byte, so strlen() would be fine here. >> >> I preferred to double-check libc, but if you prefer to get rid of >> those assertions I'm okay with that. > > I don't think it's Spectrum's place to confirm libc is POSIX compliant. > Adding these checks isn't free, because it's more stuff to get past to > understand what's going on. Oh, that makes sense. >>>> + if (entry->d_name[0] == '.') >>>> + if (len == 1 || (len == 2 && entry->d_name[1] == '.')) >>>> + continue; >>>> + unsigned char c = (unsigned char)entry->d_name[0]; >>>> + if (!((c >= 'A' && c <= 'Z') || >>>> + (c >= 'a' && c <= 'z'))) >>>> + errx(EXIT_FAILURE, "Filename must begin with an ASCII letter"); >>> >>> Would the comparison not be valid without the cast? >> >> It would be in this case, but the subsequent cast to int in the error >> path assumes that the cast was done. Signed characters are much >> trickier and casting to unsigned makes the code easier to reason about. > > Is it safe to assume 'A' etc. are representable and comparable as > unsigned values? (I'm sure it is in practice, but I'd like it if I > could be confident this is being done strictly correctly.) I don't know if the C standard requires it, but we assume it. >>>> + if (entry->d_type != DT_REG) >>>> + errx(EXIT_FAILURE, "Entry contains non-regular file %s", entry->d_name); >>>> + } >>>> + closedir(d); >>>> +} >>>> + >>>> +int main(int argc, char **argv) >>>> +{ >>>> + for (int i = 1; i < argc; ++i) { >>> >>>> + int fd = open(argv[i], O_DIRECTORY|O_RDONLY|O_CLOEXEC|O_NOFOLLOW); >>> >>> Wasn't O_NOFOLLOW going to be removed here? >> >> It should never be called on a symlink, so if it does that is a bug >> in the caller. I can remove it if you prefer, though. > > I don't think it's the place of this program to put constraints on its > caller like that. Makes sense. -- Sincerely, Demi Marie Obenour (she/her/hers)