* NEWS: Mention the bug fix.
* src/comm.c (usage): Remove mention that FILE1 and FILE2 cannot both be
standard input.
(compare_files): Only close standard input once.
* doc/coreutils.texi (comm invocation): Document the behavior of
'comm - -' which is not portable to all implementations.
* tests/comm/dash-dash.sh: New file.
* tests/misc/comm.pl: Move to tests/comm/comm.pl.
* tests/local.mk (all_tests): Add the new test. Rename the existing
test.
This is for consistency with other implementations. Also, since the
interface separates -b and -c, it might in future support -g (graphemes).
Normalizing content with a filter seems like the most appropriate
approach anyway, since various normalizations are possible (including
case), rather than baking that into every tool.
* doc/coreutils.texi (cut invocation): Add back the -d description,
adjust for multi-byte support, expand on specifying a NUL
delimiter, and detail the behavior when the delimiter matches
the line delimiter.
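A minimal sketch of the NUL delimiter case, assuming GNU cut, where an
empty -d argument selects the NUL byte as the field delimiter. The helper
name nul_field is made up for illustration.

```shell
# Hypothetical helper: print field N of each NUL-delimited input line.
# Assumes GNU cut, which treats -d '' as "use NUL as the delimiter".
nul_field() {
  cut -d '' -f "$1"
}

# Three NUL-separated fields on one line; select the second.
printf 'alpha\000beta\000gamma\n' | nul_field 2
```

Other cut implementations may reject an empty delimiter, so this is not
portable.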
* doc/coreutils.texi (cut invocation): State explicitly that
-s --whitespace-delimited=trimmed will suppress lines that
do not have field separating blanks.
* src/cut.c (usage): Mention blank characters are used to separate.
* doc/coreutils.texi (cut invocation): Likewise. Also describe
the 'trimmed' argument and the relation to -F.
To improve compatibility with toybox/busybox scripts.
* doc/coreutils.texi (cut invocation): Add -O description.
* src/cut.c: Support -O as well as --output-delimiter.
* tests/cut/cut.pl: Adjust one case to use -O.
Both the i18n patch and FreeBSD/macOS support this option.
They do differ in behavior somewhat as the i18n patch
may output more bytes than requested.
$ printf '\xc3\xa9b\n' | i18n-cut -n -b1
é
There is also a bug in the i18n patch with a multi-byte
character at the start of a line:
$ printf '\xc3\xa9b\n' | i18n-cut -n -b1-2
éb
We follow the FreeBSD behavior since it seems more
useful to have -b be a hard limit, rather than a soft limit.
This also reduces the possibility of duplicate character output
with separate cut invocations with non-overlapping byte ranges.
* src/cut.c (cut_bytes_no_split): A new function
similar to cut_characters, to handle multi-byte characters
with byte limit semantics.
* tests/cut/cut.pl: Add test cases.
* doc/coreutils.texi (tty invocation): Mention that POSIX.1-2001 removed
the -s option and that portable scripts can redirect standard output to
/dev/null instead.
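The portable replacement for 'tty -s' can be sketched as follows. The
function name stdin_is_tty is made up for illustration. Only the exit
status of tty is used.

```shell
# Portable stand-in for 'tty -s': discard the printed terminal name and
# rely on the exit status (0 when standard input is a terminal).
stdin_is_tty() {
  tty >/dev/null
}

if stdin_is_tty; then
  echo "stdin is a terminal"
else
  echo "stdin is not a terminal"
fi
```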
cksum --check is often the first interaction
users have with possibly untrusted downloads, so we should try
to be as defensive as possible when processing it.
Specifically we currently only escape \n characters in file names
presented in checksum files being parsed with cksum --check.
This gives some possibility of dumping arbitrary data to the terminal
when checking downloads from an untrusted source.
This change gives these advantages:
1. Avoids dumping arbitrary data to vulnerable terminals
2. Avoids visual deception with ANSI codes hiding checksum failures
3. More secure if users copy and paste file names from --check output
4. Simplifies programmatic parsing
Note this changes programmatic parsing, but given the original
format was so awkward to parse, I expect that's extremely rare.
I was not able to find an example in the wild at least.
To parse the new format from shell, you can do something like:
  cksum -c checksums | while IFS= read -r line; do
    case $line in
      *': FAILED')
        filename=$(eval "printf '%s' ${line%: FAILED}")
        cp -v "$filename" /quarantine
        ;;
    esac
  done
This change also slightly reduces the size of the sum(1) utility.
This change also applies to md5sum, sha*sum, and b2sum.
* src/cksum.c (digest_check): Call quotef() instead of
cksum(1) specific quoting.
* tests/cksum/md5sum-bsd.sh: Adjust accordingly.
* doc/coreutils.texi (cksum general options): Describe the
shell quoting used for problematic file names.
* NEWS: Mention the change in behavior.
Reported by: Aaron Rainbolt
Cleartext signatures have many gotchas. Therefore, the use of detached
signatures is recommended where possible. See:
<https://gnupg.org/blog/20251226-cleartext-signatures.html>.
* doc/coreutils.texi (tee invocation): Adjust gpg invocation to produce
a detached signature.
Here is an example of the performance improvement:
$ yes abcdefghijklmnopqrstuvwxyz | head -n 100000000 > input
$ time ./src/wc-prev -l < input
100000000
real 0m0.793s
user 0m0.630s
sys 0m0.162s
$ time ./src/wc -l < input
100000000
real 0m0.230s
user 0m0.065s
sys 0m0.164s
* NEWS: Mention the performance improvement.
* gnulib: Update to the latest commit.
* configure.ac: Check for the necessary intrinsics and functions.
* src/local.mk (noinst_LIBRARIES) [USE_NEON_WC_LINECOUNT]: Add
src/libwc_neon.a.
(src_libwc_neon_a_SOURCES, wc_neon_ldadd, src_libwc_neon_a_CFLAGS)
[USE_NEON_WC_LINECOUNT]: New variables.
(src_wc_LDADD) [USE_NEON_WC_LINECOUNT]: Add $(wc_neon_ldadd).
* src/wc.c [USE_NEON_WC_LINECOUNT]: Include sys/auxv.h and asm/hwcap.h.
(neon_supported) [USE_NEON_WC_LINECOUNT]: New function.
(wc_lines) [USE_NEON_WC_LINECOUNT]: Use neon_supported and
wc_lines_neon.
* src/wc.h (wc_lines_neon): Add declaration.
* src/wc_neon.c: New file.
* doc/coreutils.texi (Hardware Acceleration): Document the "-ASIMD"
hwcap and the variable used in ./configure to override detection of Neon
instructions.
* tests/wc/wc-cpu.sh: Also add "-ASIMD" to disable the use of Neon
instructions.
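A rough sketch of how the Neon path might be disabled at run time for
comparison. It assumes the GLIBC_TUNABLES=glibc.cpu.hwcaps=... form
described in the Hardware Acceleration section. The function names are
made up for illustration. On CPUs without Neon the setting should have no
effect on the result.

```shell
# Count lines using whatever implementation wc selects by default.
count_lines() {
  wc -l
}

# Count lines while asking wc not to use Neon (ASIMD) instructions.
# Assumption: wc honors the "-ASIMD" hwcap via GLIBC_TUNABLES.
count_lines_no_neon() {
  GLIBC_TUNABLES=glibc.cpu.hwcaps=-ASIMD wc -l
}

# Both invocations should report the same count; only the speed differs.
seq 1 1000 | count_lines
seq 1 1000 | count_lines_no_neon
```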
Font problem reported by Michael Aramini via Alejandro Colomar
<https://bugs.gnu.org/80258>. This patch also fixes some
longstanding confusion with date synopses.
* src/date.c (usage): Do not imply that only -u can be used with
MMDDhhmm..., and do not put misleading brackets around the latter.
This is a more standard mechanism to disable markup.
* src/system.h (oputs_): Logic change to honor TERM=dumb,
rather than HELP_NO_MARKUP=something.
* doc/coreutils.texi: Adjust the description for --help.
* man/local.mk: Ensure TERM is set to something,
so that man pages have links included.
* man/viewman: Just honor the user's $TERM.
* tests/misc/getopt_vs_usage.sh: Remove env var complication,
as TERM is unset automatically.
* tests/misc/usage_vs_refs.sh: Likewise.
* NEWS: Adjust the change in behavior note.
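A minimal sketch of requesting unmarked --help output under this scheme.
The wrapper name plain_help is made up for illustration, and the sketch
assumes only that TERM=dumb is honored as described above.

```shell
# Hypothetical wrapper: run "COMMAND --help" with TERM=dumb so that no
# highlighting or link escapes are emitted.
plain_help() {
  TERM=dumb "$@" --help
}

# Show the first lines of the plain usage text for ls.
plain_help ls | sed -n '1,3p'
```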
* src/cp.c (usage): The -HLP options are close
in functionality and close alphabetically, so describe together.
* doc/coreutils.texi (cp invocation): Likewise.
* doc/coreutils.texi: Add missing anchors.
* src/pr.c (usage): Adjust to use -COLS, to avoid a clash
with the additional anchor added to the manual.
Also mark up the --columns option as done for other options.
* tests/split/line-bytes.sh: Also fix --lines-bytes typo here.
* src/ls.c (oputs): A new function that wraps puts(),
but also highlights the --option-text portion, and
adds links to the appropriate part of the online manual.
(usage): Call oputs() rather than puts().
* doc/coreutils.texi (--help): Document new HELP_NO_MARKUP env var,
which can be used in the edge case where one wants to suppress ANSI escapes.
* tests/misc/getopt_vs_usage.sh: Use HELP_NO_MARKUP to ensure the
test continues to pass.
* src/paste.c (usage): Mention how lines are processed
with and without the -s option. Also mention that -d
supports backslash escapes.
* doc/coreutils.texi (paste invocation): Likewise.
Also detail the backslash escapes, noting which are non-POSIX.
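The serial and parallel modes, and the '\0' delimiter escape (the POSIX
escape for an empty string), can be illustrated as follows. The helper
name join_lines is made up for illustration.

```shell
# With -s, all lines of one input are joined serially into a single line.
printf 'a\nb\nc\n' | paste -s -d ',' -

# Without -s, corresponding lines of each input are merged in parallel;
# here standard input is read twice, so lines are paired up.
printf '1\n2\n3\n4\n' | paste -d ':' - -

# Hypothetical helper: '\0' in the delimiter list means "no delimiter".
join_lines() {
  paste -s -d '\0' -
}
printf 'a\nb\nc\n' | join_lines
```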
Align the -t implementation with the Heirloom project.
* src/ptx.c (usage): Describe -t, and also mention
the default width is 72 when not used.
(main): Override the default width if -t is specified.
* doc/coreutils.texi (ptx invocation): Describe -t and the default width.
* tests/ptx/ptx.pl: Add test cases.
* NEWS: Mention the change in behavior.
* doc/coreutils.texi (Introduction): Use ç instead of @,{c}.
(Character arrays): Use ö instead of @"o. Use Ł instead of @L{}.
(Formatting file timestamps): Use ä instead of @"a.
Following commit v9.3-92-g1b86b70dd
$TMPDIR is part of the interface and an important behavioral
characteristic of a command, which should be documented.
* doc/coreutils.texi (split invocation): Mention $TMPDIR is honored.
(tac invocation): Likewise.
* src/split.c (usage): Likewise.
* src/tac.c (usage): Likewise.
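As a small illustration, tac may need a temporary file when its input is
not seekable (for example a pipe), and $TMPDIR selects where that file is
created. The function name and the scratch directory are made up for
illustration.

```shell
# Hypothetical helper: reverse the lines of standard input, placing any
# temporary file tac needs in the directory given as the first argument.
reverse_with_tmpdir() {
  TMPDIR="$1" tac
}

scratch=$(mktemp -d)
printf '1\n2\n3\n' | reverse_with_tmpdir "$scratch"
rm -rf "$scratch"
```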
* doc/coreutils.texi (dd invocation): Document the behavior of 'dd' on
multibyte characters and some unspecified behavior that will be
documented in a future POSIX release [1].
[1] https://austingroupbugs.net/view.php?id=1959
* doc/coreutils.texi (quotingStyles): Expand on the advantages
of "shell-escape" quoting, and mention it's the default when
outputting to a tty. Also mention that it is useful with
LC_ALL=C to further disambiguate output. Also reference the
separate page detailing various considerations and options
for file name quoting. Also move the mention of the default
quoting style to the top of the page where it's more obvious.
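A minimal sketch of the disambiguation that "shell-escape" quoting gives,
assuming GNU ls. The function name quoted_names is made up for
illustration. A file name containing a newline is listed on a single line
in a form that can be pasted back into a shell.

```shell
# Hypothetical helper: list a directory with shell-escape quoting in the
# C locale, so problematic bytes in names are shown unambiguously.
quoted_names() {
  LC_ALL=C ls --quoting-style=shell-escape -- "$1"
}

dir=$(mktemp -d)
: > "$dir/$(printf 'two\nlines')"   # name with an embedded newline
quoted_names "$dir"                 # printed on one line, quoted
rm -rf "$dir"
```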
* doc/coreutils.texi: Explicitly supply empty arguments to macros,
as dvi (a required prerequisite to pdf) is more strict in its
handling of macro arguments.
* cfg.mk (sc_texi_ensure_empty_option_args): Add a syntax check,
since this is not verified in the default build.
Reported by Collin Funk.
* doc/coreutils.texi (optAnchor): A new macro to output a
referenceable anchor, called from ...
(optItem): ... here; a new macro to output all index entries
for each option item.
(optZero,optZeroTerminated): Show an example of the adjustment
done to each option description.
* doc/local.mk (html-local): Post-process the texinfo generated HTML
(`make html`) to remove our "-option" tag, and replace all
escaped _002d with a standard hyphen, which is fine in URLs.
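The hyphen replacement could be sketched as a simple filter like the one
below. This is not the actual rule in doc/local.mk. The function name and
the sample anchor are made up for illustration, and the removal of the
"-option" tag is not modeled.

```shell
# Hypothetical filter: replace texinfo's escaped hyphen (_002d) with a
# plain hyphen in generated HTML.
unescape_hyphens() {
  sed 's/_002d/-/g'
}

# Illustrative anchor only, not taken from the real manual.
printf '%s\n' '<a href="#example-_002d_002dverbose">' | unescape_hyphens
```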