The gnulib gitweb server returns a 302 redirect, but curl was called
without -L so it saved the HTML redirect page instead of init.sh.
This caused all 33 GNU upstream tests to fail in CI since the init.sh
fetch was introduced in c1b66e4.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: match GNU error format for unrecognized options
Use single quotes and remove colon to match GNU diff/cmp output:
`unrecognized option '--foobar'` instead of `unrecognized option: "--foobar"`
Also use `contains` instead of `starts_with` in the integration test
to handle the command prefix (e.g. `cmp: unrecognized option ...`).
Follow-up to #178 / #179.
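The format change above can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names `unrecognized_option` and `with_prefix` are hypothetical.

```rust
/// Hypothetical helper: builds the GNU-style message for an unknown option,
/// using single quotes and no colon after "option".
fn unrecognized_option(opt: &str) -> String {
    format!("unrecognized option '{opt}'")
}

/// Hypothetical helper: prefixes a message with the utility name,
/// producing lines such as `cmp: unrecognized option '--foobar'`.
fn with_prefix(util: &str, msg: &str) -> String {
    format!("{util}: {msg}")
}
```

Because the utility name comes first, a test that checks the full stderr line has to use `contains` rather than `starts_with` on the message text.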
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: apply cargo fmt formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
On my Mac I see this test fail quite consistently. This change makes it
more resilient on systems with slower startup times, while still
allowing faster systems to finish as soon as possible.
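One common way to get this behavior is to poll for a condition with a deadline instead of sleeping for a fixed time. The sketch below assumes that approach. The function name `wait_until` and its parameters are hypothetical and are not taken from the test itself.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `ready` until it returns true or `timeout` elapses.
/// Returns true as soon as the condition holds, so fast systems do not
/// wait for the full timeout. Returns false if the deadline passes first.
fn wait_until(mut ready: impl FnMut() -> bool, timeout: Duration, step: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Short pause between checks to avoid spinning on the CPU.
        sleep(step);
    }
}
```

The timeout only bounds the worst case. A generous value costs nothing on machines where the condition becomes true quickly.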
The test was failing in the regular macOS terminal because it defaults
to LC_ALL=C. Best to standardize the locale, like the other tests that
check for locale-dependent output.
Before this change, we would first find all changes so we could obtain
the largest offset we will report and use that to set up the padding.
Now we use the file sizes to estimate the largest possible offset.
Not only does this allow us to print earlier and reduce memory usage,
since we no longer store diffs to report later, but it also fixes a
case in which our output differed from GNU cmp's, which also seems to
estimate based on size.
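The size-based estimate can be sketched like this. It is an illustration under an assumption: the larger of the two file sizes is used as a conservative upper bound on any reported offset. The function names are hypothetical.

```rust
/// Number of decimal digits needed to print `n`.
fn decimal_width(n: u64) -> usize {
    // checked_ilog10 returns None for 0, which still needs one digit.
    n.checked_ilog10().map_or(1, |d| d as usize + 1)
}

/// Assumption: no reported offset can exceed the larger file size, so its
/// digit count is a safe column width that is known before comparing.
fn offset_width(len_a: u64, len_b: u64) -> usize {
    decimal_width(len_a.max(len_b))
}
```

With the width known up front, each difference can be printed immediately, for example with `format!("{:>w$}", offset, w = width)`, instead of being collected first.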
Memory usage drops by a factor of roughly 1000(!), with no loss of
performance, when comparing two binaries of hundreds of MBs:
Before:
Maximum resident set size (kbytes): 2489260
Benchmark 1: ../target/release/diffutils \
cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
Time (mean ± σ): 14.466 s ± 0.166 s [User: 12.367 s, System: 2.012 s]
Range (min … max): 14.350 s … 14.914 s 10 runs
After:
Maximum resident set size (kbytes): 2636
Benchmark 1: ../target/release/diffutils \
cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
Time (mean ± σ): 13.724 s ± 0.038 s [User: 12.263 s, System: 1.372 s]
Range (min … max): 13.667 s … 13.793 s 10 runs
The utility should support all the arguments supported by GNU cmp and
perform slightly better.
In a "bad" scenario, two ~36M files which are completely different, our
version runs in ~74% of the time of the original on my M1 Max:
> hyperfine --warmup 1 -i --output=pipe \
'cmp -l huge huge.3'
Benchmark 1: cmp -l huge huge.3
Time (mean ± σ): 3.237 s ± 0.014 s [User: 2.891 s, System: 0.341 s]
Range (min … max): 3.221 s … 3.271 s 10 runs
Warning: Ignoring non-zero exit code.
> hyperfine --warmup 1 -i --output=pipe \
'../target/release/diffutils cmp -l huge huge.3'
Benchmark 1: ../target/release/diffutils cmp -l huge huge.3
Time (mean ± σ): 2.392 s ± 0.009 s [User: 1.978 s, System: 0.406 s]
Range (min … max): 2.378 s … 2.406 s 10 runs
Warning: Ignoring non-zero exit code.
Our cmp takes ~116% of the original's time when comparing libxul.so to
the chromium-browser binary with -l and -b. In a best-case scenario of
comparing two files which are the same except for the last byte, our
tool is slightly faster.
This is in preparation for adding the other diffutils commands: cmp,
diff3, and sdiff.
We use a similar strategy to uutils/coreutils, with the single binary
acting as one of the supported tools if called through a symlink with
the appropriate name. When using the multi-tool binary directly, the
utility name needs to be the first parameter.
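The dispatch strategy described above might look roughly like the sketch below. It is not the project's actual code: the function name `select_utility`, its return shape, and the exact utility list are assumptions made for illustration.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Returns the selected utility name and the index of its first own argument.
/// Hypothetical sketch: the utility list is illustrative.
fn select_utility(args: &[OsString]) -> Option<(String, usize)> {
    const UTILITIES: [&str; 4] = ["cmp", "diff", "diff3", "sdiff"];
    let name_of = |arg: &OsString| -> Option<String> {
        Path::new(arg).file_stem()?.to_str().map(str::to_owned)
    };
    // Called through a symlink such as `cmp`: argv[0] names the utility.
    if let Some(name) = args.first().and_then(name_of) {
        if UTILITIES.contains(&name.as_str()) {
            return Some((name, 1));
        }
    }
    // Called as the multi-tool binary: the first parameter names the utility.
    let name = args.get(1)?.to_str()?;
    UTILITIES.contains(&name).then(|| (name.to_owned(), 2))
}
```

For example, `["/usr/bin/cmp", "a", "b"]` and `["diffutils", "cmp", "a", "b"]` both select `cmp`. Only the index of the first remaining argument differs.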
* Implement -q/--brief option
* Optimization: stop analyzing the files as soon as there are any differences
* Unit tests for the stop_early parameter
* Simplify checks
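The early-exit optimization can be illustrated with a simplified byte-level comparison. The real diff works on lines and is more involved. The function `differing_positions` is hypothetical, and only the `stop_early` parameter name comes from the notes above.

```rust
/// Returns the positions at which `a` and `b` differ.
/// With `stop_early` set (as a brief mode would), the scan returns after the
/// first difference, since only "do the files differ?" matters.
fn differing_positions(a: &[u8], b: &[u8], stop_early: bool) -> Vec<usize> {
    let mut found = Vec::new();
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        if x != y {
            found.push(i);
            if stop_early {
                return found;
            }
        }
    }
    // A length mismatch counts as a difference at the end of the shorter input.
    if a.len() != b.len() {
        found.push(a.len().min(b.len()));
    }
    found
}
```

In brief mode the caller only needs to check whether the result is empty, so the remaining input is never examined once a difference is found.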