94 Commits

Author SHA1 Message Date
oech3 dc9ca179f3 cmp: use .map_err 2026-06-10 15:12:36 +02:00
oech3 f29e96cdba cmp.rs: simplify by .ok_or 2026-06-10 10:21:25 +02:00
Marc Herbert ec3428b48f tests: fix "gpatch --version" error message on mac (#235)
When we fail to find `gpatch`, don't say that we failed to find `patch`.

Cosmetic fix to commit 1254f146f8 ("tests: validate "patch" and "ed"
commands once, print meaningful messages (#226)")

This is an extremely minor fix because the error message already printed
"gpatch validation failed, no such file or directory" even before this
commit.

Signed-off-by: Marc Herbert <Marc.Herbert@gmail.com>
2026-06-03 14:11:51 +02:00
oech3 58da229c09 support prefixed names (#231) 2026-06-03 14:11:29 +02:00
Marc Herbert 1254f146f8 tests: validate "patch" and "ed" commands once, print meaningful messages (#226)
macOS' /usr/bin/patch and GNU patch have very subtle incompatibilities
that cause only some "more advanced" tests to fail in obscure and very
time-consuming ways - while other tests pass. In some cases (depending
on test threads racing), the lack of newlines in some test data even
causes the whole test suite to stall.

This fix runs `patch --version` (only once), makes sure the output starts
with "GNU patch" and shows a meaningful assert message when not. It also
looks for `gpatch` instead of `patch` on macOS and shows a meaningful
assert message if either is missing.

Fixes: #225

This also provides faster and better feedback when `ed` is missing (see
#39) and implements a portable and basic check.

Last but not least, this new code is generic enough to support the
validation of any other test dependency in the future.
2026-05-24 17:08:52 +02:00
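A minimal sketch of the "validate once" idea described in the commit above, not the code from the test suite itself. The helper names `is_gnu_patch`, `patch_command` and `assert_gnu_patch_available` are hypothetical, and caching the result in a `OnceLock` is an assumption about how "only once" might be implemented.

```rust
use std::process::Command;
use std::sync::OnceLock;

/// Returns true when `--version` output identifies GNU patch.
fn is_gnu_patch(version_output: &str) -> bool {
    version_output.starts_with("GNU patch")
}

/// Binary to look for: `gpatch` on macOS, `patch` elsewhere.
fn patch_command() -> &'static str {
    if cfg!(target_os = "macos") {
        "gpatch"
    } else {
        "patch"
    }
}

/// Runs `<patch> --version` at most once per process and caches the verdict.
/// Hypothetical helper: the real test code may be organized differently.
fn assert_gnu_patch_available() {
    static CHECK: OnceLock<Result<(), String>> = OnceLock::new();
    let result = CHECK.get_or_init(|| {
        let cmd = patch_command();
        let output = Command::new(cmd)
            .arg("--version")
            .output()
            .map_err(|e| format!("{cmd} validation failed: {e}"))?;
        let stdout = String::from_utf8_lossy(&output.stdout);
        if is_gnu_patch(&stdout) {
            Ok(())
        } else {
            Err(format!("{cmd} is not GNU patch"))
        }
    });
    // Every caller sees the same cached message, so a failure is reported
    // up front instead of as an obscure test failure or a stall.
    if let Err(msg) = result {
        panic!("{msg}");
    }
}
```

A test that shells out to `patch` would call `assert_gnu_patch_available()` first; the same pattern could be repeated for `ed` or any other external dependency.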
Gunter Schmidt d33aca1fff cmp Feat: change data type for 'bytes' limit and 'ignore initial' to u64 (#183)
* feat: u64 for --bytes and --ignore-initial

fix: bumped up tempfile to "3.26.0"

The variables for --bytes, --ignore-initial and line count were of type 'usize',
thus limiting the readable bytes on 32-bit systems.
GNU cmp is compiled with LFS (Large File Support) and allows i64 values.

This is now all u64, which works also on 32-bit systems with Rust.
There is no reason to implement a 32-bit barrier for 32 bit machines.

Additionally the --bytes limit can be set to 'u128' using the feature
"cmp_bytes_limit_128_bit".

The performance impact would be negligible, as there are only a few calculations
each time a full block is read from the file.


---------

Co-authored-by: Gunter Schmidt <gsgit@beadsoft.de>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2026-05-14 23:13:02 +02:00
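To illustrate why a `u64` limit costs little, here is a rough sketch under stated assumptions. The function names `parse_limit` and `next_read_len` are hypothetical, and the sketch ignores any size suffixes the real option parser may accept.

```rust
/// Parses a `--bytes` style limit as u64, so the accepted range does not
/// depend on the pointer width of the target. (Plain digits only here.)
fn parse_limit(arg: &str) -> Result<u64, std::num::ParseIntError> {
    arg.parse::<u64>()
}

/// How many bytes to read next: the buffer length, capped by what is left
/// of the limit. This is the only per-block arithmetic the limit requires.
fn next_read_len(remaining: u64, buf_len: usize) -> usize {
    if remaining < buf_len as u64 {
        // `remaining` is smaller than a usize value, so the cast is lossless.
        remaining as usize
    } else {
        buf_len
    }
}
```

On a 32-bit target, `parse_limit("5000000000")` still succeeds, whereas parsing into `usize` would fail there.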
Marc Herbert d73fa831b0 tests: fix "old" names in generated patch files
Fixes #223. Very simple reproduction

```
cd diffutils
mkdir a
touch a/alef  a/alefn  a/alef_  a/alefx  a/alefr  a/fuzz.file
cargo test
```
 => fail

https://www.gnu.org/software/diffutils/manual/html_node/Multiple-Patches.html
states that the "old" file name has precedence over the "new" filename.

I hit this problem because some other (and unfortunately: unknown for
now) test issue left bogus `a/alef*` file(s) behind in my workspace. I
didn't bother cleaning them up because I assumed some test would keep
recreating them and that cost me a lot of time.

This issue seems to have existed since the very first commit.
Interestingly, there was a previous attempt in 2024 to fix this in commit
a3a372ff36! So I was apparently not the only one affected. BUT that
fix was immediately reverted by commit ba7cb0aef9 in the same
PR. Admittedly, that fix seemed somewhat off-topic in
https://github.com/uutils/diffutils/pull/33. So here it is again.
2026-05-13 14:02:35 +02:00
viju a340afb6d1 fix build failure on wasm32-wasip1 target (#215)
Co-authored-by: viju <pocopepe@vijus-MacBook-Air.local>
2026-04-18 19:32:32 +02:00
Ryuji Yasukochi 9dcca24fb0 fix: match GNU error format for unrecognized options (#180)
* fix: match GNU error format for unrecognized options

Use single quotes and remove colon to match GNU diff/cmp output:
`unrecognized option '--foobar'` instead of `unrecognized option: "--foobar"`

Also use `contains` instead of `starts_with` in the integration test
to handle the command prefix (e.g. `cmp: unrecognized option ...`).

Follow-up to #178 / #179.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: apply cargo fmt formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 11:48:58 +01:00
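The message shape described above can be sketched as follows. The function name `unrecognized_option` is hypothetical; only the quoting style and the `cmp:` prefix example come from the commit message.

```rust
/// Builds the error text in the GNU style: single quotes, no colon after
/// "option", prefixed with the utility name.
fn unrecognized_option(util: &str, opt: &str) -> String {
    format!("{util}: unrecognized option '{opt}'")
}
```

Because the utility name comes first, a test that checks this output needs `contains("unrecognized option")` rather than `starts_with`.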
Aster Boese 357c99038f cmp: fix 32-bit usize overflow in test (#173)
Fixes https://github.com/uutils/diffutils/issues/172
2026-03-07 18:35:57 +01:00
Ryuji Yasukochi 6f082c6572 fix: rename "Unknown option" to "unrecognized option" for diff and cmp (#179) 2026-02-28 13:43:56 +01:00
Gustavo Noronha Silva e00ff6b108 cmp: stop allocating for byte printing
This makes verbose comparison of 37MB completely different files 2.34x
faster than our own baseline, putting our cmp at almost 6x faster than
GNU cmp (/opt/homebrew/bin/cmp) on my M4 Pro Mac. The output remains
identical to that of GNU cmp. Mostly equal and smaller files do not
regress.

Benchmark 1: ./bin/baseline/diffutils cmp -lb t/huge t/eguh
  Time (mean ± σ):      1.669 s ±  0.011 s    [User: 1.594 s, System: 0.073 s]
  Range (min … max):    1.654 s …  1.689 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: ./target/release/diffutils cmp -lb t/huge t/eguh
  Time (mean ± σ):     714.2 ms ±   4.1 ms    [User: 629.3 ms, System: 82.7 ms]
  Range (min … max):   707.2 ms … 721.5 ms    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 3: /opt/homebrew/bin/cmp -lb t/huge t/eguh
  Time (mean ± σ):      4.213 s ±  0.050 s    [User: 4.128 s, System: 0.081 s]
  Range (min … max):    4.160 s …  4.316 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 4: /usr/bin/cmp -lb t/huge t/eguh
  Time (mean ± σ):      3.892 s ±  0.048 s    [User: 3.819 s, System: 0.070 s]
  Range (min … max):    3.808 s …  3.976 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  ./target/release/diffutils cmp -lb t/huge t/eguh ran
    2.34 ± 0.02 times faster than ./bin/baseline/diffutils cmp -lb t/huge t/eguh
    5.45 ± 0.07 times faster than /usr/bin/cmp -lb t/huge t/eguh
    5.90 ± 0.08 times faster than /opt/homebrew/bin/cmp -lb t/huge t/eguh
2026-01-02 11:21:48 -03:00
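One way to avoid a heap allocation per printed byte, shown only as a sketch: return a fixed-size array instead of a `String` and write it straight to the output. The names `octal3` and `write_pair` are hypothetical, and the zero-padded layout is a simplification, not a claim about the exact column format of `cmp -l`.

```rust
use std::io::{self, Write};

/// Formats `byte` as three octal digits in a stack array; no heap allocation.
fn octal3(byte: u8) -> [u8; 3] {
    [
        b'0' + (byte >> 6),
        b'0' + ((byte >> 3) & 7),
        b'0' + (byte & 7),
    ]
}

/// Writes both bytes of a differing pair directly to `out`.
fn write_pair<W: Write>(out: &mut W, a: u8, b: u8) -> io::Result<()> {
    out.write_all(&octal3(a))?;
    out.write_all(b" ")?;
    out.write_all(&octal3(b))?;
    out.write_all(b"\n")
}
```

Wrapping the output in a `BufWriter` keeps the many small `write_all` calls cheap.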
LunarEclipse a09dcac41d Fix compilation for 32-bit targets 2025-12-22 00:04:48 +01:00
Daniel Hofstetter b59d9be943 clippy: fix warnings from unnecessary_unwrap lint 2025-08-08 11:48:58 +02:00
Daniel Hofstetter 7df02399ba clippy: fix warning from ptr_arg lint 2025-06-27 10:52:37 +02:00
Daniel Hofstetter 03fe614087 clippy: fix warnings from useless_format lint 2025-06-27 10:50:06 +02:00
Daniel Hofstetter 8261d790f4 clippy: fix warnings from uninlined_format_args 2025-06-27 10:45:40 +02:00
Sami Daniel (Tsoi) 1ef6923b7d Add side by side diff (partial)
Create the diff -y utility, this time introducing tests and changes focused
mainly on the construction of the utility and on issues related to alignment
and output tabulation. New parameters were introduced, such as the total
width of the output. A new calculation determines the width of the output
columns and the maximum total column width. The tab and spacing mechanism
behaves the same way as in the original diff, with tabs and spaces formatted
identically.

    - Introducing tests for the diff 'main' function
    - Introducing fuzzing for side diff utility
    - Introducing tests for internal mechanisms
    - Modular functions that allow consistent changes across the entire project
2025-06-02 22:33:04 -03:00
Sami Daniel 8105420bb4 Create the side-by-side option (-y) feature for the diff command (Incomplete).
- Create the function limited_string, in the utils package, which truncates a string based on a
delimiter (may break the encoding of the character where it was cut)

- Create tests for limited_string function

- Add support for -y and --side-by-side flags that enables diff output for side-by-side mode

- Create an implementation of the diff -y (SideBySide) command, the base command for sdiff, using the crate
diff as engine. Currently it does not fully represent GNU diff -y: some flags (|, (, ), , /) could
not be implemented due to a limitation of the engine we currently use (crate diff), which did not
allow us to build the required logic around it. Only '<' and '>' are enabled.

- Create tests for SideBySide implementation
2025-06-02 22:32:11 -03:00
Gustavo Noronha Silva a316262603 cmp: print verbose diffs as we find them
Before this change, we would first find all changes so we could obtain
the largest offset we will report and use that to set up the padding.

Now we use the file sizes to estimate the largest possible offset.
This lets us print earlier and reduces memory usage, as we no longer
store diffs to report later. It also fixes a case in which our output
differed from GNU cmp's, because GNU cmp also seems to estimate the
padding based on size.

Memory usage drops by a factor of 1000(!), without losing performance
while comparing 2 binaries of hundreds of MBs:

Before:

 Maximum resident set size (kbytes): 2489260

 Benchmark 1: ../target/release/diffutils \
 cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
   Time (mean ± σ):     14.466 s ±  0.166 s    [User: 12.367 s, System: 2.012 s]
   Range (min … max):   14.350 s … 14.914 s    10 runs

After:

 Maximum resident set size (kbytes): 2636

 Benchmark 1: ../target/release/diffutils \
 cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
   Time (mean ± σ):     13.724 s ±  0.038 s    [User: 12.263 s, System: 1.372 s]
   Range (min … max):   13.667 s … 13.793 s    10 runs
2024-10-08 08:28:06 -03:00
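The padding estimate can be sketched roughly like this. Both function names are hypothetical, and using the larger of the two file sizes as the upper bound is an assumption made for the sketch; the commit only says that file sizes are used for the estimate.

```rust
/// Number of decimal digits needed to print `n`.
fn decimal_width(mut n: u64) -> usize {
    let mut width = 1;
    while n >= 10 {
        n /= 10;
        width += 1;
    }
    width
}

/// Width of the offset column, derived from the file sizes alone, so no
/// difference has to be buffered before the first line is printed.
fn offset_column_width(len_a: u64, len_b: u64) -> usize {
    // Assumption: no reported offset can exceed the larger file size.
    decimal_width(len_a.max(len_b))
}
```

With the width known up front, each difference can be written out as soon as it is found, which is what removes the need to hold all of them in memory.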
Gustavo Noronha Silva fac8dab182 cmp: completely avoid Rust fmt in verbose mode
This makes the code less readable, but gets us a massive improvement
to performance. Comparing ~36M completely different files now takes
~40% of the time. Compared to GNU cmp, we now run the same comparison
in ~26% of the time.

This also improves comparing binary files. A comparison of chromium
and libxul now takes ~60% of the time. We also beat GNU cmp by about
the same margin.

Before:

 > hyperfine --warmup 1 -i --output=pipe \
     '../target/release/diffutils cmp -l huge huge.3'
 Benchmark 1: ../target/release/diffutils cmp -l huge huge.3
   Time (mean ± σ):      2.000 s ±  0.016 s    [User: 1.603 s, System: 0.392 s]
   Range (min … max):    1.989 s …  2.043 s    10 runs

   Warning: Ignoring non-zero exit code.

 > hyperfine --warmup 1 -i --output=pipe \
     '../target/release/diffutils cmp -l -b \
     /usr/lib64/chromium-browser/chromium-browser \
     /usr/lib64/firefox/libxul.so'
 Benchmark 1: ../target/release/diffutils cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
   Time (mean ± σ):     24.704 s ±  0.162 s    [User: 21.948 s, System: 2.700 s]
   Range (min … max):   24.359 s … 24.889 s    10 runs

   Warning: Ignoring non-zero exit code.

After:

 > hyperfine --warmup 1 -i --output=pipe \
     '../target/release/diffutils cmp -l huge huge.3'
 Benchmark 1: ../target/release/diffutils cmp -l huge huge.3
   Time (mean ± σ):     849.5 ms ±   6.2 ms    [User: 538.3 ms, System: 306.8 ms]
   Range (min … max):   839.4 ms … 857.7 ms    10 runs

   Warning: Ignoring non-zero exit code.

 > hyperfine --warmup 1 -i --output=pipe \
     '../target/release/diffutils cmp -l -b \
     /usr/lib64/chromium-browser/chromium-browser \
     /usr/lib64/firefox/libxul.so'
 Benchmark 1: ../target/release/diffutils cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
   Time (mean ± σ):     14.646 s ±  0.040 s    [User: 12.328 s, System: 2.286 s]
   Range (min … max):   14.585 s … 14.702 s    10 runs

   Warning: Ignoring non-zero exit code.
2024-10-01 13:30:57 -03:00
Gustavo Noronha Silva 2e681301b4 cmp: avoid using advanced rust formatting for -l
Octal conversion and simple integer to string both show up in profiling.
This change improves comparing ~36M completely different files with both
-l and -b by ~11-13%.
2024-10-01 13:30:57 -03:00
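As an illustration of a plain integer-to-string conversion that bypasses the `fmt` machinery, here is a small sketch. The name `format_decimal` is hypothetical and this is not necessarily how the commit implements it.

```rust
/// Writes the decimal digits of `n` into the tail of `buf` and returns them
/// as a slice. 20 bytes are enough for u64::MAX, which has 20 digits.
fn format_decimal(mut n: u64, buf: &mut [u8; 20]) -> &[u8] {
    let mut pos = buf.len();
    loop {
        pos -= 1;
        // Peel off the least significant digit and store it right to left.
        buf[pos] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    &buf[pos..]
}
```

The caller owns the buffer, so it can be reused for every line of output without allocating.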
Gustavo Noronha Silva 50057412bd Add cmp utility
The utility should support all the arguments supported by GNU cmp and
perform slightly better.

On a "bad" scenario, ~36M files which are completely different, our
version runs in ~72% of the time of the original on my M1 Max:

 > hyperfine --warmup 1 -i --output=pipe \
     'cmp -l huge huge.3'
 Benchmark 1: cmp -l huge huge.3
   Time (mean ± σ):      3.237 s ±  0.014 s    [User: 2.891 s, System: 0.341 s]
   Range (min … max):    3.221 s …  3.271 s    10 runs

   Warning: Ignoring non-zero exit code.

 > hyperfine --warmup 1 -i --output=pipe \
     '../target/release/diffutils cmp -l huge huge.3'
 Benchmark 1: ../target/release/diffutils cmp -l huge huge.3
   Time (mean ± σ):      2.392 s ±  0.009 s    [User: 1.978 s, System: 0.406 s]
   Range (min … max):    2.378 s …  2.406 s    10 runs

   Warning: Ignoring non-zero exit code.

Our cmp runs in ~116% of the time when comparing libxul.so to the
chromium-browser binary with -l and -b. In a best case scenario of
comparing 2 files which are the same except for the last byte, our
tool is slightly faster.
2024-10-01 13:30:57 -03:00
Gustavo Noronha Silva 72c7802f06 Take utility name as first parameter on diffutils
This is in preparation for adding the other diffutils commands, cmp,
diff3, sdiff.

We use a similar strategy to uutils/coreutils, with the single binary
acting as one of the supported tools if called through a symlink with
the appropriate name. When using the multi-tool binary directly, the
utility needs to be the first parameter.
2024-09-26 21:22:24 -03:00
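The dispatch strategy described above might look roughly like the following sketch. The function `select_utility` and the contents of the `UTILS` list are assumptions for illustration, not the project's actual code.

```rust
use std::path::Path;

/// Utilities this sketch knows about (illustrative list).
const UTILS: [&str; 2] = ["diff", "cmp"];

/// Returns the selected utility and its remaining arguments, or None if the
/// invocation does not name a known utility.
fn select_utility(args: &[String]) -> Option<(String, Vec<String>)> {
    // Case 1: called through a symlink, e.g. `/usr/bin/cmp a b`.
    let invoked_as = Path::new(args.first()?).file_stem()?.to_str()?.to_owned();
    if UTILS.contains(&invoked_as.as_str()) {
        return Some((invoked_as, args[1..].to_vec()));
    }
    // Case 2: multi-tool binary, e.g. `diffutils diff a b`.
    let util = args.get(1)?;
    if UTILS.contains(&util.as_str()) {
        Some((util.clone(), args[2..].to_vec()))
    } else {
        None
    }
}
```

In a real binary the input would come from `std::env::args().collect::<Vec<String>>()`.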
Olivier Tilloy d8b91fd60e Update unit test expectation 2024-09-19 22:33:33 +02:00
Daniel Hofstetter 2a899a9fc7 Fix clippy warnings in tests
from needless_borrows_for_generic_args lint
2024-09-06 09:27:53 +02:00
Olivier Tilloy fa4e0c6097 Make error message consistent with GNU diff's implementation when failing to read input file(s) 2024-06-04 14:57:50 +02:00
Tanmay Patil 8c6a648aef Merge branch 'main' into handle-directory-input 2024-04-23 23:11:31 +05:30
Olivier Tilloy b7261a43f4 Break out the logic to match context/unified diff params into separate functions, for improved readability 2024-04-22 18:01:00 +02:00
Olivier Tilloy 37fe1ae808 Handle --normal, -e and --ed options 2024-04-22 18:01:00 +02:00
Olivier Tilloy 22d973fce6 Parse all valid arguments accepted by GNU diff to request a regular context (with an optional number of lines) 2024-04-22 18:01:00 +02:00
Olivier Tilloy fe28610f21 Parse all valid arguments accepted by GNU diff to request a unified context (with an optional number of lines) 2024-04-22 18:01:00 +02:00
Sylvestre Ledru 3a8eddfe2c Fix typos 2024-04-21 16:07:01 +02:00
Tanmay Patil 39d2ece187 Handle directory-file and file-directory comparisons in the diff
GNU diff treats `diff DIRECTORY FILE` as `diff DIRECTORY/FILE FILE`
2024-04-21 16:10:48 +05:30
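A sketch of the operand rewrite described above. The function `resolve_operands` is hypothetical, the directory flags are passed in explicitly to keep the example free of filesystem access, and using only the final path component of FILE is an assumption.

```rust
use std::path::{Path, PathBuf};

/// If exactly one operand is a directory, compare the other operand against
/// the file of the same name inside that directory.
fn resolve_operands(
    from: &Path,
    from_is_dir: bool,
    to: &Path,
    to_is_dir: bool,
) -> (PathBuf, PathBuf) {
    match (from_is_dir, to_is_dir) {
        // `diff DIRECTORY FILE` becomes `diff DIRECTORY/FILE FILE`
        (true, false) => match to.file_name() {
            Some(name) => (from.join(name), to.to_path_buf()),
            None => (from.to_path_buf(), to.to_path_buf()),
        },
        // `diff FILE DIRECTORY` becomes `diff FILE DIRECTORY/FILE`
        (false, true) => match from.file_name() {
            Some(name) => (from.to_path_buf(), to.join(name)),
            None => (from.to_path_buf(), to.to_path_buf()),
        },
        // Two files or two directories: leave the operands untouched.
        _ => (from.to_path_buf(), to.to_path_buf()),
    }
}
```

A caller would typically obtain the two flags from `Path::is_dir`.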
Olivier Tilloy 14799eea89 Move test assertions in the cfg block where they belong 2024-04-21 00:13:52 +02:00
Olivier Tilloy 831348d1fc Fix file path in ed diff tests 2024-04-21 00:12:43 +02:00
Tanmay Patil aedd0684d1 Replace only the first two occurrences of timestamp regex 2024-04-16 10:41:38 +05:30
Tanmay Patil 54c02bdf0b Use NamedTempFile instead of manually creating files 2024-04-16 10:17:09 +05:30
Tanmay Patil ba7cb0aef9 Do not create dummy files
Since we are now returning SystemTime::now() for invalid file input,
there is no need to create dummy files
2024-04-14 22:56:37 +05:30
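The fallback mentioned above can be sketched in a few lines. The name `modification_time_or_now` is hypothetical and the project's own function may differ in signature and return type.

```rust
use std::fs;
use std::path::Path;
use std::time::SystemTime;

/// Returns the file's modification time, or the current time when the
/// metadata cannot be read (missing file, unsupported platform, ...).
fn modification_time_or_now(path: &Path) -> SystemTime {
    fs::metadata(path)
        .and_then(|meta| meta.modified())
        .unwrap_or_else(|_| SystemTime::now())
}
```

Since a missing path no longer produces an error, tests can pass arbitrary file names without creating placeholder files first.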
Tanmay Patil 33783d094e Improve tests 2024-04-14 17:16:53 +05:30
Tanmay Patil 900e1c3a68 Tests: Replace modification time in diff with "TIMESTAMP" placeholder 2024-04-14 13:43:30 +05:30
Tanmay Patil 0a77fe12b9 Add tests for get_modification_time function 2024-04-13 21:31:13 +05:30
Tanmay Patil 86bd05c739 Merge branch 'context-diff-modification-time' of github.com:TanmayPatil105/diffutils into context-diff-modification-time 2024-04-10 22:31:09 +05:30
Tanmay Patil 00e18a6b0c Define assert_diff_eq macro for context&unified diff comparison 2024-04-10 22:20:48 +05:30
Tanmay f6eb0835b0 Merge branch 'main' into context-diff-modification-time 2024-04-10 22:13:18 +05:30
Olivier Tilloy 84ad116845 Use io::stdin() to read from standard input in a portable manner 2024-04-08 20:21:24 +02:00
Olivier Tilloy 6dc34fed44 Handle the rewrite of "-" to "/dev/stdin" in main to leave the filenames unchanged (fixes #46) 2024-04-08 20:21:24 +02:00
Olivier Tilloy c325291696 Unit test to verify that conflicting output styles result in an error 2024-04-05 23:22:26 +02:00
Tanmay Patil 72da7fca40 Show current time if fs::metadata errors 2024-04-04 20:01:11 +05:30
Tanmay 61fb0657c1 Merge branch 'main' into context-diff-modification-time 2024-04-04 19:56:13 +05:30