Compare commits

64 Commits

Author SHA1 Message Date
renovate[bot] 487cf4679c chore(deps): update github artifact actions 2026-06-11 10:39:55 +00:00
oech3 dc9ca179f3 cmp: use .map_err 2026-06-10 15:12:36 +02:00
oech3 f29e96cdba cmp.rs: simplify by .ok_or 2026-06-10 10:21:25 +02:00
renovate[bot] a46dae68b1 chore(deps): update rust crate regex to v1.12.4 2026-06-10 07:20:17 +02:00
renovate[bot] 1a8d7f96a6 chore(deps): update codecov/codecov-action action to v7 (#240)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2026-06-07 09:08:29 +02:00
renovate[bot] 53599ccd40 chore(deps): update rust crate libfuzzer-sys to v0.4.13 2026-06-05 07:24:05 +02:00
renovate[bot] 9bc64f03ed chore(deps): update rust crate chrono to v0.4.45 2026-06-05 07:17:21 +02:00
Marc Herbert d266f9b90e CI: remove echo "/opt/homebrew/opt/gpatch/libexec/gnubin" >> "$GITHUB_PATH" (#234)
This is not needed now that tests automatically look for `gpatch` on mac
since commit 1254f146f8 ("tests: validate "patch" and "ed" commands
once, print meaningful messages (#226)")

The fewer PATH changes, the better.

Signed-off-by: Marc Herbert <Marc.Herbert@gmail.com>
2026-06-03 14:13:10 +02:00
Marc Herbert ec3428b48f tests: fix "gpatch --version" error message on mac (#235)
When we fail to find `gpatch`, don't say that we failed to find `patch`.

Cosmetic fix to commit 1254f146f8 ("tests: validate "patch" and "ed"
commands once, print meaningful messages (#226)")

This is an extremely minor fix because the error message already printed
"gpatch validation failed, no such file or directory" even before this
commit.

Signed-off-by: Marc Herbert <Marc.Herbert@gmail.com>
2026-06-03 14:11:51 +02:00
oech3 58da229c09 support prefixed names (#231) 2026-06-03 14:11:29 +02:00
Sylvestre Ledru 250f935efe Add CONTRIBUTING.md pointing to the review guidelines 2026-05-30 10:07:54 +02:00
Marc Herbert 1254f146f8 tests: validate "patch" and "ed" commands once, print meaningful messages (#226)
macOS' /usr/bin/patch and GNU patch have very subtle incompatibilities
that cause only some "more advanced" tests to fail in obscure and very
time-consuming ways - while other tests pass. In some cases (depending
on test threads racing), the lack of newlines in some test data even
causes the whole test suite to stall.

This fix runs `patch --version` (only once), makes sure the output starts
with "GNU patch" and shows a meaningful assert message when not. It also
looks for `gpatch` instead of `patch` on macOS and shows a meaningful
assert message if either is missing.

Fixes: #225

This also provides faster and better feedback when `ed` is missing (see
#39) and implements a portable and basic check.

Last but not least, this new code is generic enough to support the
validation of any other test dependency in the future.
2026-05-24 17:08:52 +02:00
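The "validate once" idea from commit 1254f146f8 can be sketched roughly as follows. This is only an illustration, not the code from the commit: the names `is_gnu_patch`, `validate_patch`, `patch_program` and `patch_checked` are hypothetical, and caching through `std::sync::OnceLock` is an assumption about how "only once" might be achieved.

```rust
use std::process::Command;
use std::sync::OnceLock;

/// Pure check: does the `--version` output start with "GNU patch"?
fn is_gnu_patch(version_output: &str) -> bool {
    version_output.starts_with("GNU patch")
}

/// Hypothetical helper: run `<program> --version` and describe any failure.
fn validate_patch(program: &str) -> Result<(), String> {
    let output = Command::new(program)
        .arg("--version")
        .output()
        .map_err(|e| format!("{} validation failed: {}", program, e))?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    if is_gnu_patch(&stdout) {
        Ok(())
    } else {
        Err(format!("{} is not GNU patch, --version printed: {}", program, stdout))
    }
}

/// Program name to look for: `gpatch` on macOS, `patch` elsewhere.
fn patch_program() -> &'static str {
    if cfg!(target_os = "macos") { "gpatch" } else { "patch" }
}

/// Runs the validation at most once per process and caches the outcome,
/// so every test can assert on the same result with a meaningful message.
fn patch_checked() -> &'static Result<(), String> {
    static CHECK: OnceLock<Result<(), String>> = OnceLock::new();
    CHECK.get_or_init(|| validate_patch(patch_program()))
}
```

A test would then start with something like `assert!(patch_checked().is_ok(), "{:?}", patch_checked())`, failing fast instead of stalling later.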
renovate[bot] c1943c5abb chore(deps): update rust crate divan to v4.7.0 2026-05-23 07:09:07 +02:00
Gunter Schmidt d33aca1fff cmp Feat: change data type for 'bytes' limit and 'ignore initial' to u64 (#183)
* feat: u64 for --bytes and --ignore-initial

fix: bumped up tempfile to "3.26.0"

The variables for --bytes, --ignore-initial and line count were of type 'usize',
thus limiting the readable bytes on 32-bit systems.
GNU cmp is compiled with LFS (Large File Support) and allows i64 values.

This is now all u64, which works also on 32-bit systems with Rust.
There is no reason to implement a 32-bit barrier for 32 bit machines.

Additionally the --bytes limit can be set to 'u128' using the feature
"cmp_bytes_limit_128_bit".

The performance impact would be negligible, as there are only a few calculations
each time a full block is read from the file.


---------

Co-authored-by: Gunter Schmidt <gsgit@beadsoft.de>
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
2026-05-14 23:13:02 +02:00
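To illustrate why the type matters, here is a minimal sketch, not taken from the commit. The function names are hypothetical, and `u32` merely stands in for what `usize` is on a 32-bit target.

```rust
use std::num::ParseIntError;

/// Parse a byte-count argument (such as a `--bytes` value) as u64.
fn parse_byte_limit(arg: &str) -> Result<u64, ParseIntError> {
    arg.parse::<u64>()
}

/// The same parse with u32, standing in for `usize` on a 32-bit system.
fn parse_byte_limit_32(arg: &str) -> Result<u32, ParseIntError> {
    arg.parse::<u32>()
}
```

A limit of 5000000000 bytes (about 5 GB) parses with the first function and is rejected by the second, because it exceeds `u32::MAX`.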
renovate[bot] 649179069c chore(deps): update dawidd6/action-download-artifact action to v21 2026-05-14 10:29:01 +02:00
Daniel Hofstetter 9fe96ed5e9 Make compare_test_results.py executable 2026-05-14 10:25:48 +02:00
pocopepe 2c47ea9f04 feat: add gnu testsuite tracking workflows 2026-05-14 09:42:26 +02:00
Daniel Hofstetter 9a7a727da4 Fix end of some files 2026-05-14 09:41:36 +02:00
Daniel Hofstetter a24b0c391e ci: add .pre-commit-config.yaml 2026-05-14 09:41:36 +02:00
xtqqczze 3f2c8678da actions: add security audit workflow 2026-05-14 09:41:25 +02:00
Marc Herbert d73fa831b0 tests: fix "old" names in generated patch files
Fixes #223. Very simple reproduction:

```
cd diffutils
mkdir a
touch a/alef  a/alefn  a/alef_  a/alefx  a/alefr  a/fuzz.file
cargo test
```
 => fail

https://www.gnu.org/software/diffutils/manual/html_node/Multiple-Patches.html
states that the "old" file name has precedence over the "new" filename.

I hit this problem because some other (and unfortunately: unknown for
now) test issue left bogus `a/alef*` file(s) behind in my workspace. I
didn't bother cleaning them up because I assumed some test would keep
recreating them and that cost me a lot of time.

This issue seems to have existed since the very first commit.
Interestingly, there was a previous attempt in 2024 to fix this in commit
a3a372ff36! So I was apparently not the only one affected. BUT that
fix was immediately reverted by commit ba7cb0aef9 in the same
PR. Admittedly, that fix seemed somewhat off-topic in
https://github.com/uutils/diffutils/pull/33. So here it is again.
2026-05-13 14:02:35 +02:00
renovate[bot] be90f75e68 chore(deps): update rust crate assert_cmd to v2.2.2 2026-05-12 09:13:23 +02:00
oech3 259e51d0d4 *.yml: dedup env: 2026-05-09 16:56:20 +02:00
xtqqczze da98437b08 chore: add COPYRIGHT file and update license references
- move copyright information from LICENSE-* files to COPYRIGHT
- use some improved wording used by the Rust project
2026-05-09 16:48:09 +02:00
xtqqczze 54db7b0b3e Add SECURITY.md
Copied from https://github.com/uutils/coreutils/blob/5e974797bd8050c2d425a706670254ad0323404d/SECURITY.md

Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>
2026-05-09 16:44:14 +02:00
renovate[bot] c811142a6c chore(deps): update rust crate divan to v4.6.0 2026-04-28 16:05:06 +02:00
pocopepe 4043bb1928 skip --help args in fuzz_cmp_args since it causes process exit 2026-04-21 10:25:34 +02:00
pocopepe d11f672d29 fix: fuzz targets missing target dir and silent CI failures 2026-04-21 10:25:34 +02:00
oech3 37abce4eab Add CI for wasm32 (#218)
Co-authored-by: oech3 <>
2026-04-19 11:05:03 +02:00
viju a340afb6d1 fix build failure on wasm32-wasip1 target (#215)
Co-authored-by: viju <pocopepe@vijus-MacBook-Air.local>
2026-04-18 19:32:32 +02:00
renovate[bot] 18c5533b82 chore(deps): update rust crate assert_cmd to v2.2.1 2026-04-17 16:42:59 +02:00
renovate[bot] 34f3935b71 chore(deps): update rust crate divan to v4.5.0 2026-04-17 13:34:41 +02:00
renovate[bot] 904efda150 chore(deps): update softprops/action-gh-release action to v3 2026-04-12 10:20:51 +02:00
renovate[bot] af3e010b26 chore(deps): update rust crate rand to v0.10.1 2026-04-11 15:27:05 +02:00
xtqqczze 0001b2036e chore(deps): update rust crates 2026-04-11 08:06:26 +02:00
renovate[bot] 8aa2a2cb7c chore(deps): update codecov/codecov-action action to v6 2026-03-27 07:25:18 +01:00
Kevin Burke 23890b6c94 fix: follow redirects when fetching gnulib init.sh in upstream test suite (#202)
The gnulib gitweb server returns a 302 redirect, but curl was called
without -L so it saved the HTML redirect page instead of init.sh.
This caused all 33 GNU upstream tests to fail in CI since the init.sh
fetch was introduced in c1b66e4.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 07:51:26 +01:00
renovate[bot] 5fc37c7c73 chore(deps): update rust crate itoa to v1.0.18 2026-03-20 06:05:18 +01:00
renovate[bot] f4895861db chore(deps): update rust crate divan to v4.4.1 2026-03-12 14:01:06 +01:00
renovate[bot] 25cad28b99 chore(deps): update rust crate divan to v4.4.0 2026-03-12 11:21:24 +01:00
renovate[bot] 454f5436ce chore(deps): update rust crate tempfile to v3.27.0 2026-03-11 06:13:15 +01:00
renovate[bot] 2efd4e17fa chore(deps): update rust crate assert_cmd to v2.2.0 2026-03-11 06:10:01 +01:00
Ryuji Yasukochi 9dcca24fb0 fix: match GNU error format for unrecognized options (#180)
* fix: match GNU error format for unrecognized options

Use single quotes and remove colon to match GNU diff/cmp output:
`unrecognized option '--foobar'` instead of `unrecognized option: "--foobar"`

Also use `contains` instead of `starts_with` in the integration test
to handle the command prefix (e.g. `cmp: unrecognized option ...`).

Follow-up to #178 / #179.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: apply cargo fmt formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 11:48:58 +01:00
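The message format described in #180 can be sketched as below. The helper name `unrecognized_option` is hypothetical, and the real code may build the string differently.

```rust
/// Build the GNU-style message: single quotes around the option, no colon.
fn unrecognized_option(opt: &str) -> String {
    format!("unrecognized option '{}'", opt)
}
```

Because the utility prefixes its own name (for example `cmp: unrecognized option '--foobar'`), a test on the full line has to use `contains` rather than `starts_with`, as the commit message notes.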
oech3 5660d0eafb Cargo.toml: Simplify profiles 2026-03-08 21:53:27 +01:00
renovate[bot] c624dc489d chore(deps): update moonrepo/setup-rust action to v1 2026-03-08 14:03:47 +01:00
oech3 bdf449eaf2 Publish binary from main (#163) 2026-03-07 22:04:19 +01:00
oech3 f8248801a9 fuzzing.yml: Avoid non reusable cache generation (#170) 2026-03-07 18:39:56 +01:00
Aster Boese 357c99038f cmp: fix 32-bit usize overflow in test (#173)
Fixes https://github.com/uutils/diffutils/issues/172
2026-03-07 18:35:57 +01:00
codspeed-hq[bot] 59e130aa22 Add CodSpeed performance benchmarking workflow and badge (#189)
Co-authored-by: codspeed-hq[bot] <117304815+codspeed-hq[bot]@users.noreply.github.com>
2026-03-07 15:12:18 +01:00
Gunter Schmidt 54c8b7aeb9 feat: Divan Benchmark (#185)
* feat: Criterion Benchmark

* fix: Replaced Criterion with codspeed drop-in replacement

* feat: uses Divan instead of Criterion

* changed file num lines to file size in kb

---------

Co-authored-by: Gunter Schmidt <gsgit@beadsoft.de>
2026-03-07 14:55:10 +01:00
Ryuji Yasukochi 6f082c6572 fix: rename "Unknown option" to "unrecognized option" for diff and cmp (#179) 2026-02-28 13:43:56 +01:00
renovate[bot] 34db0ade7c chore(deps): update rust crate tempfile to v3.26.0 2026-02-24 10:29:37 +01:00
renovate[bot] d3d0b0c966 chore(deps): update rust crate chrono to v0.4.44 2026-02-23 14:00:09 +01:00
renovate[bot] 87e0aa2828 chore(deps): update rust crate predicates to v3.1.4 2026-02-12 06:27:45 +01:00
renovate[bot] 9f419c31ea chore(deps): update rust crate libfuzzer-sys to v0.4.12 2026-02-10 23:23:31 +01:00
renovate[bot] 95883b462b chore(deps): update rust crate tempfile to v3.25.0 2026-02-10 06:08:40 +01:00
Daniel Hofstetter f20af97a09 Merge pull request #166 from uutils/renovate/regex-1.x-lockfile
chore(deps): update rust crate regex to v1.12.3
2026-02-03 17:06:50 +01:00
renovate[bot] b9b7ea8d2b chore(deps): update rust crate regex to v1.12.3 2026-02-03 14:47:31 +00:00
Sylvestre Ledru 47798b4b2c Merge pull request #164 from oech3/patch-2
Use preinstalled rust, disable incremental build
2026-01-25 17:22:11 +01:00
oech3 445e1ea02f Use preinstalled rust, disable incremental build 2026-01-25 11:58:26 +09:00
renovate[bot] e2fb192d52 chore(deps): update rust crate chrono to v0.4.43 2026-01-15 06:13:15 +01:00
renovate[bot] a1d18a0c09 chore(deps): update rust crate assert_cmd to v2.1.2 2026-01-09 19:40:57 +01:00
Sylvestre Ledru 5dd2e9d30c cmp: stop allocating for byte printing (#153)
This makes verbose comparison of 37MB completely different files 2.34x
faster than our own baseline, putting our cmp at almost 6x faster than
GNU cmp (/opt/homebrew/bin/cmp) on my M4 Pro Mac. The output remains
identical to that of GNU cmp. Mostly equal and smaller files do not
regress.

Benchmark 1: ./bin/baseline/diffutils cmp -lb t/huge t/eguh
  Time (mean ± σ):      1.669 s ±  0.011 s    [User: 1.594 s, System: 0.073 s]
  Range (min … max):    1.654 s …  1.689 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: ./target/release/diffutils cmp -lb t/huge t/eguh
  Time (mean ± σ):     714.2 ms ±   4.1 ms    [User: 629.3 ms, System: 82.7 ms]
  Range (min … max):   707.2 ms … 721.5 ms    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 3: /opt/homebrew/bin/cmp -lb t/huge t/eguh
  Time (mean ± σ):      4.213 s ±  0.050 s    [User: 4.128 s, System: 0.081 s]
  Range (min … max):    4.160 s …  4.316 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 4: /usr/bin/cmp -lb t/huge t/eguh
  Time (mean ± σ):      3.892 s ±  0.048 s    [User: 3.819 s, System: 0.070 s]
  Range (min … max):    3.808 s …  3.976 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  ./target/release/diffutils cmp -lb t/huge t/eguh ran
    2.34 ± 0.02 times faster than ./bin/baseline/diffutils cmp -lb t/huge t/eguh
    5.45 ± 0.07 times faster than /usr/bin/cmp -lb t/huge t/eguh
    5.90 ± 0.08 times faster than /opt/homebrew/bin/cmp -lb t/huge t/eguh
2026-01-08 23:33:42 +01:00
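One way to print byte values without allocating is to format them into a fixed-size stack buffer instead of building a `String` per byte. The sketch below assumes the 3-column, space-padded octal layout that `cmp -l` uses for byte values. The function `octal3` is hypothetical and is not claimed to be what the commit implements.

```rust
/// Format `byte` as right-aligned, space-padded octal in a 3-byte stack buffer.
/// A u8 never needs more than three octal digits (255 is 377 in octal).
fn octal3(byte: u8) -> [u8; 3] {
    let mut buf = [b' '; 3];
    let mut n = byte;
    let mut i = 3;
    loop {
        i -= 1;
        buf[i] = b'0' + (n & 7); // lowest octal digit
        n >>= 3;
        if n == 0 {
            break;
        }
    }
    buf
}
```

The returned array can be written straight to a locked, buffered stdout with `write_all`, so the hot loop performs no heap allocation.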
Gustavo Noronha Silva e00ff6b108 cmp: stop allocating for byte printing
This makes verbose comparison of 37MB completely different files 2.34x
faster than our own baseline, putting our cmp at almost 6x faster than
GNU cmp (/opt/homebrew/bin/cmp) on my M4 Pro Mac. The output remains
identical to that of GNU cmp. Mostly equal and smaller files do not
regress.

Benchmark 1: ./bin/baseline/diffutils cmp -lb t/huge t/eguh
  Time (mean ± σ):      1.669 s ±  0.011 s    [User: 1.594 s, System: 0.073 s]
  Range (min … max):    1.654 s …  1.689 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: ./target/release/diffutils cmp -lb t/huge t/eguh
  Time (mean ± σ):     714.2 ms ±   4.1 ms    [User: 629.3 ms, System: 82.7 ms]
  Range (min … max):   707.2 ms … 721.5 ms    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 3: /opt/homebrew/bin/cmp -lb t/huge t/eguh
  Time (mean ± σ):      4.213 s ±  0.050 s    [User: 4.128 s, System: 0.081 s]
  Range (min … max):    4.160 s …  4.316 s    10 runs

  Warning: Ignoring non-zero exit code.

Benchmark 4: /usr/bin/cmp -lb t/huge t/eguh
  Time (mean ± σ):      3.892 s ±  0.048 s    [User: 3.819 s, System: 0.070 s]
  Range (min … max):    3.808 s …  3.976 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  ./target/release/diffutils cmp -lb t/huge t/eguh ran
    2.34 ± 0.02 times faster than ./bin/baseline/diffutils cmp -lb t/huge t/eguh
    5.45 ± 0.07 times faster than /usr/bin/cmp -lb t/huge t/eguh
    5.90 ± 0.08 times faster than /opt/homebrew/bin/cmp -lb t/huge t/eguh
2026-01-02 11:21:48 -03:00
37 changed files with 2252 additions and 464 deletions
+100
@@ -0,0 +1,100 @@
name: GnuComment
on:
workflow_run:
workflows: ["GnuTests"]
types:
- completed
permissions: {}
jobs:
post-comment:
permissions:
actions: read # to list workflow runs artifacts
pull-requests: write # to comment on pr
runs-on: ubuntu-latest
if: >
github.event.workflow_run.event == 'pull_request'
steps:
- name: 'Download artifact'
uses: actions/github-script@v9
with:
script: |
// List all artifacts from GnuTests
var artifacts = await github.rest.actions.listWorkflowRunArtifacts({
owner: context.repo.owner,
repo: context.repo.repo,
run_id: ${{ github.event.workflow_run.id }},
});
// Download the "comment" artifact, which contains a PR number (NR) and result.txt
var matchArtifact = artifacts.data.artifacts.filter((artifact) => {
return artifact.name == "comment"
})[0];
if (!matchArtifact) {
console.log('No comment artifact found');
return;
}
var download = await github.rest.actions.downloadArtifact({
owner: context.repo.owner,
repo: context.repo.repo,
artifact_id: matchArtifact.id,
archive_format: 'zip',
});
var fs = require('fs');
fs.writeFileSync('${{ github.workspace }}/comment.zip', Buffer.from(download.data));
- run: unzip comment.zip || echo "Failed to unzip comment artifact"
- name: 'Comment on PR'
uses: actions/github-script@v9
with:
github-token: ${{ secrets.GITHUB_TOKEN }}
script: |
var fs = require('fs');
// Check if files exist
if (!fs.existsSync('./NR')) {
console.log('No NR file found, skipping comment');
return;
}
if (!fs.existsSync('./result.txt')) {
console.log('No result.txt file found, skipping comment');
return;
}
var issue_number = Number(fs.readFileSync('./NR'));
var content = fs.readFileSync('./result.txt');
if (content.toString().trim().length > 7) { // 7 because we have backquote + \n
// Update existing comment if present, otherwise create a new one
var marker = '<!-- gnu-tests-bot -->';
var body = marker + '\nGNU diffutils testsuite comparison:\n```\n' + content + '```';
var comments = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issue_number,
});
var existing = comments.data.filter(c => c.body.includes(marker))[0];
if (existing) {
await github.rest.issues.updateComment({
owner: context.repo.owner,
repo: context.repo.repo,
comment_id: existing.id,
body: body,
});
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issue_number,
body: body,
});
}
} else {
console.log('Comment content too short, skipping');
}
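The update-or-create decision in the script above hinges on the hidden marker string. The following is a sketch of that logic only, transliterated into Rust for illustration. The `Comment` and `CommentAction` types and the `plan_comment` function are hypothetical, and no GitHub API call is made.

```rust
/// Hidden marker the workflow prepends to its comment body.
const MARKER: &str = "<!-- gnu-tests-bot -->";

/// Minimal stand-in for a PR comment returned by the API.
struct Comment {
    id: u64,
    body: String,
}

#[derive(Debug, PartialEq)]
enum CommentAction {
    /// Rewrite the existing bot comment with this id.
    Update(u64),
    /// No bot comment yet: post a new one.
    Create,
}

/// Pick the first comment containing the marker, if any.
fn plan_comment(existing: &[Comment]) -> CommentAction {
    match existing.iter().find(|c| c.body.contains(MARKER)) {
        Some(c) => CommentAction::Update(c.id),
        None => CommentAction::Create,
    }
}
```

Reusing one comment this way keeps a PR from accumulating a new bot comment on every push.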
+231
@@ -0,0 +1,231 @@
name: GnuTests
# Run GNU diffutils testsuite against the Rust diffutils implementation
# and compare results against the main branch to catch regressions
on:
pull_request:
push:
branches:
- '*'
permissions:
contents: write # Publish diffutils instead of discarding
# End the current execution if there is a new changeset in the PR
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
env:
DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
TEST_FULL_SUMMARY_FILE: 'diffutils-gnu-full-result.json'
jobs:
native:
name: Run GNU diffutils testsuite
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
persist-credentials: false
- uses: dtolnay/rust-toolchain@master
with:
toolchain: stable
- uses: Swatinem/rust-cache@v2
### Build
- name: Build Rust diffutils binary
shell: bash
run: |
## Build Rust diffutils binary
cargo build --config=profile.release.strip=true --profile=release
zstd -19 target/release/diffutils -o diffutils-x86_64-unknown-linux-gnu.zst
- name: Publish latest commit
uses: softprops/action-gh-release@v3
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
with:
tag_name: latest-commit
body: |
commit: ${{ github.sha }}
draft: false
prerelease: true
files: |
diffutils-x86_64-unknown-linux-gnu.zst
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
### Run tests
- name: Run GNU diffutils testsuite
shell: bash
run: |
## Run GNU diffutils testsuite
./tests/run-upstream-testsuite.sh release || true
env:
TERM: xterm
- name: Upload full json results
uses: actions/upload-artifact@v7
with:
name: diffutils-gnu-full-result
path: tests/test-results.json
if-no-files-found: warn
aggregate:
needs: [native]
permissions:
actions: read
contents: read
pull-requests: read
name: Aggregate GNU test results
runs-on: ubuntu-24.04
steps:
- name: Initialize workflow variables
id: vars
shell: bash
run: |
## VARs setup
outputs() { step_id="${{ github.action }}"; for var in "$@" ; do echo steps.${step_id}.outputs.${var}="${!var}"; echo "${var}=${!var}" >> $GITHUB_OUTPUT; done; }
TEST_SUMMARY_FILE='diffutils-gnu-result.json'
outputs TEST_SUMMARY_FILE
- name: Checkout code
uses: actions/checkout@v4
with:
persist-credentials: false
- name: Retrieve reference artifacts
uses: dawidd6/action-download-artifact@v21
continue-on-error: true
with:
workflow: GnuTests.yml
branch: "${{ env.DEFAULT_BRANCH }}"
workflow_conclusion: completed
path: "reference"
if_no_artifact_found: warn
- name: Download full json results
uses: actions/download-artifact@v8
with:
name: diffutils-gnu-full-result
path: results
- name: Extract/summarize testing info
id: summary
shell: bash
run: |
## Extract/summarize testing info
outputs() { step_id="${{ github.action }}"; for var in "$@" ; do echo steps.${step_id}.outputs.${var}="${!var}"; echo "${var}=${!var}" >> $GITHUB_OUTPUT; done; }
RESULT_FILE="results/test-results.json"
if [[ ! -f "$RESULT_FILE" ]]; then
echo "::error ::Missing test results at $RESULT_FILE"
exit 1
fi
TOTAL=$(jq '[.tests[]] | length' "$RESULT_FILE")
PASS=$(jq '[.tests[] | select(.result=="PASS")] | length' "$RESULT_FILE")
FAIL=$(jq '[.tests[] | select(.result=="FAIL")] | length' "$RESULT_FILE")
SKIP=$(jq '[.tests[] | select(.result=="SKIP")] | length' "$RESULT_FILE")
ERROR=0
output="GNU diffutils tests summary = TOTAL: $TOTAL / PASS: $PASS / FAIL: $FAIL / SKIP: $SKIP"
echo "${output}"
if [[ "$FAIL" -gt 0 ]]; then
echo "::warning ::${output}"
fi
jq -n \
--arg date "$(date --rfc-email)" \
--arg sha "$GITHUB_SHA" \
--arg total "$TOTAL" \
--arg pass "$PASS" \
--arg skip "$SKIP" \
--arg fail "$FAIL" \
--arg error "$ERROR" \
'{($date): { sha: $sha, total: $total, pass: $pass, skip: $skip, fail: $fail, error: $error }}' > '${{ steps.vars.outputs.TEST_SUMMARY_FILE }}'
HASH=$(sha1sum '${{ steps.vars.outputs.TEST_SUMMARY_FILE }}' | cut --delim=" " -f 1)
outputs HASH TOTAL PASS FAIL SKIP
- name: Upload SHA1/ID of 'test-summary'
uses: actions/upload-artifact@v7
with:
name: "${{ steps.summary.outputs.HASH }}"
path: "${{ steps.vars.outputs.TEST_SUMMARY_FILE }}"
- name: Upload test results summary
uses: actions/upload-artifact@v7
with:
name: test-summary
path: "${{ steps.vars.outputs.TEST_SUMMARY_FILE }}"
- name: Compare test failures VS reference
shell: bash
run: |
## Compare test failures VS reference
REF_SUMMARY_FILE='reference/diffutils-gnu-full-result/test-results.json'
CURRENT_SUMMARY_FILE="results/test-results.json"
IGNORE_INTERMITTENT=".github/workflows/ignore-intermittent.txt"
COMMENT_DIR="reference/comment"
mkdir -p ${COMMENT_DIR}
echo ${{ github.event.number }} > ${COMMENT_DIR}/NR
COMMENT_LOG="${COMMENT_DIR}/result.txt"
COMPARISON_RESULT=0
if test -f "${CURRENT_SUMMARY_FILE}"; then
if test -f "${REF_SUMMARY_FILE}"; then
echo "Reference summary SHA1/ID: $(sha1sum -- "${REF_SUMMARY_FILE}")"
echo "Current summary SHA1/ID: $(sha1sum -- "${CURRENT_SUMMARY_FILE}")"
python3 util/compare_test_results.py \
--ignore-file "${IGNORE_INTERMITTENT}" \
--output "${COMMENT_LOG}" \
"${CURRENT_SUMMARY_FILE}" "${REF_SUMMARY_FILE}"
COMPARISON_RESULT=$?
else
echo "::warning ::Skipping test comparison; no prior reference summary is available at '${REF_SUMMARY_FILE}'."
fi
else
echo "::error ::Failed to find summary of test results (missing '${CURRENT_SUMMARY_FILE}'); failing early"
exit 1
fi
if [ ${COMPARISON_RESULT} -eq 1 ]; then
echo "::error ::Found new non-intermittent test failures"
exit 1
else
echo "::notice ::No new test failures detected"
fi
- name: Upload comparison log (for GnuComment workflow)
if: success() || failure()
uses: actions/upload-artifact@v7
with:
name: comment
path: reference/comment/
- name: Report test results
if: success() || failure()
shell: bash
run: |
## Report final results
echo "::notice ::GNU diffutils testsuite results:"
echo "::notice :: Total tests: ${{ steps.summary.outputs.TOTAL }}"
echo "::notice :: Passed: ${{ steps.summary.outputs.PASS }}"
echo "::notice :: Failed: ${{ steps.summary.outputs.FAIL }}"
echo "::notice :: Skipped: ${{ steps.summary.outputs.SKIP }}"
if [[ "${{ steps.summary.outputs.FAIL }}" -gt 0 ]]; then
PASS_RATE=$(( ${{ steps.summary.outputs.PASS }} * 100 / (${{ steps.summary.outputs.PASS }} + ${{ steps.summary.outputs.FAIL }}) ))
echo "::notice :: Pass rate: ${PASS_RATE}%"
fi
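The counting done with `jq` and the integer pass-rate arithmetic in the last step can be sketched as follows. This is an illustrative Rust rendering, not part of the workflow. The `Summary` type and both function names are hypothetical, and the input is assumed to be the list of `result` strings from `test-results.json`.

```rust
#[derive(Debug, Default, PartialEq)]
struct Summary {
    total: usize,
    pass: usize,
    fail: usize,
    skip: usize,
}

/// Tally results the way the jq filters do: total counts every entry,
/// the other fields count exact "PASS" / "FAIL" / "SKIP" matches.
fn summarize(results: &[&str]) -> Summary {
    let mut s = Summary { total: results.len(), ..Summary::default() };
    for r in results {
        match *r {
            "PASS" => s.pass += 1,
            "FAIL" => s.fail += 1,
            "SKIP" => s.skip += 1,
            _ => {}
        }
    }
    s
}

/// Integer pass rate, PASS * 100 / (PASS + FAIL), with skipped tests excluded.
/// Returns None when nothing passed or failed, to avoid dividing by zero.
fn pass_rate(s: &Summary) -> Option<usize> {
    let ran = s.pass + s.fail;
    if ran == 0 { None } else { Some(s.pass * 100 / ran) }
}
```

The workflow itself only prints the pass rate when at least one test failed, which also guarantees a non-zero divisor there.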
+15
@@ -0,0 +1,15 @@
name: Security audit
on:
schedule:
- cron: "0 0 * * *"
jobs:
audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
with:
persist-credentials: false
- uses: rustsec/audit-check@v2
with:
token: ${{ secrets.GITHUB_TOKEN }}
+4 -38
@@ -4,6 +4,7 @@ name: Basic CI
env:
CARGO_TERM_COLOR: always
CARGO_INCREMENTAL: 0
jobs:
check:
@@ -15,7 +16,6 @@ jobs:
os: [ubuntu-latest, macOS-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- run: cargo check
test:
@@ -27,12 +27,10 @@ jobs:
os: [ubuntu-latest, macOS-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: install GNU patch on MacOS
if: runner.os == 'macOS'
run: |
brew install gpatch
echo "/opt/homebrew/opt/gpatch/libexec/gnubin" >> "$GITHUB_PATH"
- name: set up PATH on Windows
# Needed to use GNU's patch.exe instead of Strawberry Perl patch
if: runner.os == 'Windows'
@@ -44,8 +42,6 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- run: rustup component add rustfmt
- run: cargo fmt --all -- --check
clippy:
@@ -57,29 +53,12 @@ jobs:
os: [ubuntu-latest, macOS-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- run: rustup component add clippy
- run: cargo clippy -- -D warnings
gnu-testsuite:
name: GNU test suite
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- run: cargo build --release
# do not fail, the report is merely informative (at least until all tests pass reliably)
- run: ./tests/run-upstream-testsuite.sh release || true
env:
TERM: xterm
- uses: actions/upload-artifact@v4
with:
name: test-results.json
path: tests/test-results.json
- run: ./tests/print-test-results.sh tests/test-results.json
coverage:
name: Code Coverage
env:
RUSTC_BOOTSTRAP: 1
runs-on: ${{ matrix.job.os }}
strategy:
fail-fast: false
@@ -96,26 +75,15 @@ jobs:
run: |
## VARs setup
outputs() { step_id="vars"; for var in "$@" ; do echo steps.${step_id}.outputs.${var}="${!var}"; echo "${var}=${!var}" >> $GITHUB_OUTPUT; done; }
# toolchain
TOOLCHAIN="nightly" ## default to "nightly" toolchain (required for certain required unstable compiler flags) ## !maint: refactor when stable channel has needed support
# * specify gnu-type TOOLCHAIN for windows; `grcov` requires gnu-style code coverage data files
case ${{ matrix.job.os }} in windows-*) TOOLCHAIN="$TOOLCHAIN-x86_64-pc-windows-gnu" ;; esac;
# * use requested TOOLCHAIN if specified
if [ -n "${{ matrix.job.toolchain }}" ]; then TOOLCHAIN="${{ matrix.job.toolchain }}" ; fi
outputs TOOLCHAIN
# target-specific options
# * CODECOV_FLAGS
CODECOV_FLAGS=$( echo "${{ matrix.job.os }}" | sed 's/[^[:alnum:]]/_/g' )
outputs CODECOV_FLAGS
- name: rust toolchain ~ install
uses: dtolnay/rust-toolchain@nightly
- run: rustup component add llvm-tools-preview
- name: install GNU patch on MacOS
if: runner.os == 'macOS'
run: |
brew install gpatch
echo "/opt/homebrew/opt/gpatch/libexec/gnubin" >> "$GITHUB_PATH"
- name: set up PATH on Windows
# Needed to use GNU's patch.exe instead of Strawberry Perl patch
if: runner.os == 'Windows'
@@ -123,7 +91,6 @@ jobs:
- name: Test
run: cargo test --all-features --no-fail-fast
env:
CARGO_INCREMENTAL: "0"
RUSTC_WRAPPER: ""
RUSTFLAGS: "-Cinstrument-coverage -Zcoverage-options=branch -Ccodegen-units=1 -Copt-level=0 -Coverflow-checks=off -Zpanic_abort_tests -Cpanic=abort"
RUSTDOCFLAGS: "-Cpanic=abort"
@@ -160,7 +127,7 @@ jobs:
grcov . --output-type lcov --output-path "${COVERAGE_REPORT_FILE}" --binary-path "${COVERAGE_REPORT_DIR}" --branch
echo "report=${COVERAGE_REPORT_FILE}" >> $GITHUB_OUTPUT
- name: Upload coverage results (to Codecov.io)
-uses: codecov/codecov-action@v5
+uses: codecov/codecov-action@v7
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: ${{ steps.coverage.outputs.report }}
@@ -168,4 +135,3 @@ jobs:
flags: ${{ steps.vars.outputs.CODECOV_FLAGS }}
name: codecov-umbrella
fail_ci_if_error: false
+37
@@ -0,0 +1,37 @@
name: CodSpeed
on:
push:
branches:
- "main"
pull_request:
# `workflow_dispatch` allows CodSpeed to trigger backtest
# performance analysis in order to generate initial data.
workflow_dispatch:
permissions:
contents: read
id-token: write
jobs:
codspeed:
name: Run benchmarks
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup rust toolchain, cache and cargo-codspeed binary
uses: moonrepo/setup-rust@v1
with:
channel: stable
cache-target: release
bins: cargo-codspeed
- name: Build the benchmark target(s)
run: cargo codspeed build -m simulation
- name: Run the benchmarks
uses: CodSpeedHQ/action@v4
with:
mode: simulation
run: cargo codspeed run
+11 -7
@@ -2,6 +2,10 @@ name: Fuzzing
# spell-checker:ignore fuzzer
+env:
+CARGO_INCREMENTAL: 0
+RUSTC_BOOTSTRAP: 1
on:
pull_request:
push:
@@ -21,15 +25,15 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
-- uses: dtolnay/rust-toolchain@nightly
- name: Install `cargo-fuzz`
-run: cargo install cargo-fuzz
+run: |
+cargo install cargo-fuzz --locked
- uses: Swatinem/rust-cache@v2
with:
shared-key: "cargo-fuzz-cache-key"
cache-directories: "fuzz/target"
- name: Run `cargo-fuzz build`
-run: cargo +nightly fuzz build
+run: cargo fuzz build
fuzz-run:
needs: fuzz-build
@@ -49,9 +53,9 @@ jobs:
- { name: fuzz_side, should_pass: true }
steps:
- uses: actions/checkout@v4
-- uses: dtolnay/rust-toolchain@nightly
- name: Install `cargo-fuzz`
-run: cargo install cargo-fuzz
+run: |
+cargo install cargo-fuzz --locked
- uses: Swatinem/rust-cache@v2
with:
shared-key: "cargo-fuzz-cache-key"
@@ -64,9 +68,9 @@ jobs:
fuzz/corpus/${{ matrix.test-target.name }}
- name: Run ${{ matrix.test-target.name }} for XX seconds
shell: bash
-continue-on-error: ${{ !matrix.test-target.name.should_pass }}
+continue-on-error: ${{ !matrix.test-target.should_pass }}
run: |
-cargo +nightly fuzz run ${{ matrix.test-target.name }} -- -max_total_time=${{ env.RUN_FOR }} -detect_leaks=0
+cargo fuzz run ${{ matrix.test-target.name }} -- -max_total_time=${{ env.RUN_FOR }} -detect_leaks=0
- name: Save Corpus Cache
uses: actions/cache/save@v5
with:
+11 -11
@@ -66,7 +66,7 @@ jobs:
shell: bash
run: "curl --proto '=https' --tlsv1.2 -LsSf https://github.com/axodotdev/cargo-dist/releases/download/v0.30.3/cargo-dist-installer.sh | sh"
- name: Cache dist
-uses: actions/upload-artifact@v4
+uses: actions/upload-artifact@v7
with:
name: cargo-dist-cache
path: ~/.cargo/bin/dist
@@ -82,7 +82,7 @@ jobs:
cat plan-dist-manifest.json
echo "manifest=$(jq -c "." plan-dist-manifest.json)" >> "$GITHUB_OUTPUT"
- name: "Upload dist-manifest.json"
-uses: actions/upload-artifact@v4
+uses: actions/upload-artifact@v7
with:
name: artifacts-plan-dist-manifest
path: plan-dist-manifest.json
@@ -131,7 +131,7 @@ jobs:
run: ${{ matrix.install_dist.run }}
# Get the dist-manifest
- name: Fetch local artifacts
-uses: actions/download-artifact@v4
+uses: actions/download-artifact@v8
with:
pattern: artifacts-*
path: target/distrib/
@@ -158,7 +158,7 @@ jobs:
cp dist-manifest.json "$BUILD_MANIFEST_NAME"
- name: "Upload artifacts"
-uses: actions/upload-artifact@v4
+uses: actions/upload-artifact@v7
with:
name: artifacts-build-local-${{ join(matrix.targets, '_') }}
path: |
@@ -180,14 +180,14 @@ jobs:
persist-credentials: false
submodules: recursive
- name: Install cached dist
-uses: actions/download-artifact@v4
+uses: actions/download-artifact@v8
with:
name: cargo-dist-cache
path: ~/.cargo/bin/
- run: chmod +x ~/.cargo/bin/dist
# Get all the local artifacts for the global tasks to use (for e.g. checksums)
- name: Fetch local artifacts
-uses: actions/download-artifact@v4
+uses: actions/download-artifact@v8
with:
pattern: artifacts-*
path: target/distrib/
@@ -205,7 +205,7 @@ jobs:
cp dist-manifest.json "$BUILD_MANIFEST_NAME"
- name: "Upload artifacts"
-uses: actions/upload-artifact@v4
+uses: actions/upload-artifact@v7
with:
name: artifacts-build-global
path: |
@@ -230,14 +230,14 @@ jobs:
persist-credentials: false
submodules: recursive
- name: Install cached dist
-uses: actions/download-artifact@v4
+uses: actions/download-artifact@v8
with:
name: cargo-dist-cache
path: ~/.cargo/bin/
- run: chmod +x ~/.cargo/bin/dist
# Fetch artifacts from scratch-storage
- name: Fetch artifacts
-uses: actions/download-artifact@v4
+uses: actions/download-artifact@v8
with:
pattern: artifacts-*
path: target/distrib/
@@ -250,14 +250,14 @@ jobs:
cat dist-manifest.json
echo "manifest=$(jq -c "." dist-manifest.json)" >> "$GITHUB_OUTPUT"
- name: "Upload dist-manifest.json"
-uses: actions/upload-artifact@v4
+uses: actions/upload-artifact@v7
with:
# Overwrite the previous copy
name: artifacts-dist-manifest
path: dist-manifest.json
# Create a GitHub Release while uploading all files to it
- name: "Download GitHub Artifacts"
uses: actions/download-artifact@v4
uses: actions/download-artifact@v8
with:
pattern: artifacts-*
path: artifacts
+28
@@ -0,0 +1,28 @@
# spell-checker:ignore wasip
name: WASI
on:
pull_request:
push:
branches:
- main
permissions:
contents: read
# End the current execution if there is a new changeset in the PR.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
test_wasi:
name: Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: dtolnay/rust-toolchain@stable
with:
targets: wasm32-wasip1
- name: check
run: cargo check --target wasm32-wasip1
+48
@@ -0,0 +1,48 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
exclude: ^tests/fixtures/
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: check-added-large-files
- id: check-executables-have-shebangs
- id: check-json
exclude: '\.vscode/(cSpell|extensions)\.json' # cSpell.json and extensions.json use comments
- id: check-shebang-scripts-are-executable
exclude: '.+\.rs' # would be triggered by #![some_attribute]
- id: check-symlinks
- id: check-toml
- id: check-yaml
args: [ --allow-multiple-documents ]
- id: destroyed-symlinks
- id: end-of-file-fixer
- id: mixed-line-ending
args: [ --fix=lf ]
- id: trailing-whitespace
- repo: local
hooks:
- id: rust-linting
name: Rust linting
description: Run cargo fmt on files included in the commit.
entry: cargo +stable fmt --
pass_filenames: true
types: [file, rust]
language: system
- id: rust-clippy
name: Rust clippy
description: Run cargo clippy on files included in the commit.
entry: cargo +stable clippy --workspace --all-targets --all-features -- -D warnings
pass_filenames: false
types: [file, rust]
language: system
- id: cspell
name: Code spell checker (cspell)
description: Run cspell to check for spelling errors (if available).
entry: bash -c 'if command -v cspell >/dev/null 2>&1; then cspell --no-must-find-files -- "$@"; else echo "cspell not found, skipping spell check"; exit 0; fi' --
pass_filenames: true
language: system
ci:
skip: [rust-linting, rust-clippy, cspell]
+32
@@ -0,0 +1,32 @@
# Contributing to diffutils
Hi! Welcome to uutils/diffutils, and thanks for wanting to contribute!
This project follows the shared conventions of the [uutils](https://github.com/uutils)
organization. Before opening a pull request, please read:
- Our **[Review Guidelines](https://uutils.github.io/reviews/)** — what we expect
from a pull request and how reviews are carried out.
- Our community's [CODE_OF_CONDUCT.md](./CODE_OF_CONDUCT.md), if present.
Finally, feel free to join our [Discord](https://discord.gg/wQVJbvJ)!
> [!WARNING]
> uutils is original code and cannot contain any code from GNU or other
> strongly-licensed (GPL/LGPL) implementations. We **cannot** accept changes
> based on the GNU source code, and you **must not link** to it either. You may
> look at permissively-licensed implementations (MIT/BSD) and read the GNU
> *manuals* — never the GNU *source*.
## In short
- Discuss non-trivial changes in an issue **before** writing the code.
- Keep pull requests **small, self-contained, and descriptively titled**
(e.g. `diffutils: fix ...`).
- Make sure CI passes: tests are green, `rustfmt` is satisfied, and there are
no `clippy` warnings.
- Add tests for new behavior; don't let coverage regress.
- Write small, atomic commits annotated with the component you touched.
See the [Review Guidelines](https://uutils.github.io/reviews/) for the full
details.
+8
@@ -0,0 +1,8 @@
Copyright (c) Michael Howell
Copyright (c) uutils developers
Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
<LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
option. All files in the project carrying such notice may not be
copied, modified, or distributed except according to those terms.
Generated
+734 -147
File diff suppressed because it is too large
+17 -7
@@ -23,20 +23,30 @@ same-file = "1.0.6"
unicode-width = "0.2.0"
[dev-dependencies]
pretty_assertions = "1.4.0"
assert_cmd = "2.0.14"
divan = { version = "4.3.0", package = "codspeed-divan-compat" }
pretty_assertions = "1.4.0"
predicates = "3.1.0"
tempfile = "3.10.1"
rand = "0.10.0"
tempfile = "3.26.0"
[profile.release]
lto = "thin"
codegen-units = 1
[profile.release-fast]
inherits = "release"
panic = "abort"
# The profile that 'dist' will build with
# alias profile for 'dist'
[profile.dist]
inherits = "release"
lto = "thin"
[[bench]]
name = "bench_diffutils"
path = "benches/bench-diffutils.rs"
harness = false
[features]
# default = ["feat_bench_not_diff"]
# Turn bench for diffutils cmp off
feat_bench_not_cmp = []
# Turn bench for diffutils diff off
feat_bench_not_diff = []
-3
@@ -1,6 +1,3 @@
Copyright (c) Michael Howell
Copyright (c) uutils developers
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
-3
@@ -1,6 +1,3 @@
Copyright (c) Michael Howell
Copyright (c) uutils developers
Permission is hereby granted, free of charge, to any
person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the
+6 -1
@@ -2,6 +2,7 @@
[![Discord](https://img.shields.io/badge/discord-join-7289DA.svg?logo=discord&longCache=true&style=flat)](https://discord.gg/wQVJbvJ)
[![License](http://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/uutils/diffutils/blob/main/LICENSE)
[![dependency status](https://deps.rs/repo/github/uutils/diffutils/status.svg)](https://deps.rs/repo/github/uutils/diffutils)
[![CodSpeed](https://img.shields.io/endpoint?url=https://codspeed.io/badge.json)](https://codspeed.io/uutils/diffutils?utm_source=badge)
[![CodeCov](https://codecov.io/gh/uutils/diffutils/branch/main/graph/badge.svg)](https://codecov.io/gh/uutils/diffutils)
@@ -53,4 +54,8 @@ $ cargo run -- -u fruits_old.txt fruits_new.txt
## License
diffutils is licensed under the MIT and Apache Licenses - see the `LICENSE-MIT` or `LICENSE-APACHE` files for details
This project is distributed under the terms of both the MIT license and the
Apache License (Version 2.0).
See [LICENSE-APACHE](LICENSE-APACHE), [LICENSE-MIT](LICENSE-MIT), and
[COPYRIGHT](COPYRIGHT) for details.
+44
@@ -0,0 +1,44 @@
# Security Policy
## Supported Versions
We provide security updates only for the latest released version of `uutils/diffutils`.
Older versions may not receive patches.
If you are using a version packaged by your Linux distribution, please check with your distribution maintainers for their update policy.
---
## Reporting a Vulnerability
**Do not open public GitHub issues for security vulnerabilities.**
This prevents accidental disclosure before a fix is available.
Instead, please use the following method:
- **Email:** [sylvestre@debian.org](mailto:sylvestre@debian.org)
- **Encryption (optional):** You may encrypt your report using our PGP key:
Fingerprint: B60D B599 4D39 BEC4 D1A9 5CCF 7E65 28DA 752F 1BE1
---
### What to Include in Your Report
To help us investigate and resolve the issue quickly, please include as much detail as possible:
- **Type of issue:** e.g. privilege escalation, information disclosure.
- **Location in the source:** file path, commit hash, branch, or tag.
- **Steps to reproduce:** exact commands, test cases, or scripts.
- **Special configuration:** any flags, environment variables, or system setup required.
- **Affected systems:** OS/distribution and version(s) where the issue occurs.
- **Impact:** your assessment of the potential severity (DoS, RCE, data leak, etc.).
---
## Disclosure Policy
We follow a **Coordinated Vulnerability Disclosure (CVD)** process:
1. We will acknowledge receipt of your report within **10 days**.
2. We will investigate, reproduce, and assess the issue.
3. We will provide a timeline for developing and releasing a fix.
4. Once a fix is available, we will publish a GitHub Security Advisory.
5. You will be credited in the advisory unless you request anonymity.
+377
@@ -0,0 +1,377 @@
// This file is part of the uutils diffutils package.
//
// For the full copyright and license information, please view the LICENSE-*
// files that were distributed with this source code.
//! Benches for all utils in diffutils.
//!
//! A file generator is included to create files of different sizes for comparison. \
//! Set the `TEMP_DIR` const to keep the files. The `df_to_file_*` files contain small changes; search for '#'. \
//! File generation is fast up to 1 GB, but benchmarking files above 100 MB takes very long.
/// Generate test files with these sizes in KB.
const FILE_SIZE_KILO_BYTES: [u64; 4] = [100, 1 * MB, 10 * MB, 25 * MB];
// const FILE_SIZE_KILO_BYTES: [u64; 3] = [100, 1 * MB, 5 * MB];
// Empty String to use TempDir (files will be removed after test) or specify dir to keep generated files
const TEMP_DIR: &str = "";
const NUM_DIFF: u64 = 4;
// Kilobytes per megabyte; only used to build FILE_SIZE_KILO_BYTES
const MB: u64 = 1_000;
const CHANGE_CHAR: u8 = b'#';
#[cfg(not(feature = "feat_bench_not_cmp"))]
mod diffutils_cmp {
use std::hint::black_box;
use diffutilslib::cmp;
use divan::Bencher;
use crate::{binary, prepare::*, FILE_SIZE_KILO_BYTES};
#[divan::bench(args = FILE_SIZE_KILO_BYTES)]
fn cmp_compare_files_equal(bencher: Bencher, kb: u64) {
let (from, to) = get_context().get_test_files_equal(kb);
let cmd = format!("cmp {from} {to}");
let opts = str_to_options(&cmd).into_iter().peekable();
let params = cmp::parse_params(opts).unwrap();
bencher
// .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
.with_inputs(|| params.clone())
.bench_refs(|params| black_box(cmp::cmp(&params).unwrap()));
}
// bench the actual compare; cmp exits on first difference
#[divan::bench(args = FILE_SIZE_KILO_BYTES)]
fn cmp_compare_files_different(bencher: Bencher, bytes: u64) {
let (from, to) = get_context().get_test_files_different(bytes);
let cmd = format!("cmp {from} {to} -s");
let opts = str_to_options(&cmd).into_iter().peekable();
let params = cmp::parse_params(opts).unwrap();
bencher
// .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
.with_inputs(|| params.clone())
.bench_refs(|params| black_box(cmp::cmp(&params).unwrap()));
}
// bench original GNU cmp
#[divan::bench(args = FILE_SIZE_KILO_BYTES)]
fn cmd_cmp_gnu_equal(bencher: Bencher, bytes: u64) {
let (from, to) = get_context().get_test_files_equal(bytes);
let args_str = format!("{from} {to}");
bencher
// .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
.with_inputs(|| args_str.clone())
.bench_refs(|cmd_args| binary::bench_binary("cmp", cmd_args));
}
// bench the compiled release version
#[divan::bench(args = FILE_SIZE_KILO_BYTES)]
fn cmd_cmp_release_equal(bencher: Bencher, bytes: u64) {
let (from, to) = get_context().get_test_files_equal(bytes);
let args_str = format!("cmp {from} {to}");
bencher
// .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
.with_inputs(|| args_str.clone())
.bench_refs(|cmd_args| binary::bench_binary("target/release/diffutils", cmd_args));
}
}
#[cfg(not(feature = "feat_bench_not_diff"))]
mod diffutils_diff {
// use std::hint::black_box;
use crate::{binary, prepare::*, FILE_SIZE_KILO_BYTES};
// use diffutilslib::params;
use divan::Bencher;
// bench the actual compare
// TODO diff does not have a diff function
// #[divan::bench(args = [100_000,10_000])]
// fn diff_compare_files(bencher: Bencher, bytes: u64) {
// let (from, to) = gen_testfiles(lines, 0, "id");
// let cmd = format!("cmp {from} {to}");
// let opts = str_to_options(&cmd).into_iter().peekable();
// let params = params::parse_params(opts).unwrap();
//
// bencher
// // .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
// .with_inputs(|| params.clone())
// .bench_refs(|params| diff::diff(&params).unwrap());
// }
// bench original GNU diff
#[divan::bench(args = FILE_SIZE_KILO_BYTES)]
fn cmd_diff_gnu_equal(bencher: Bencher, bytes: u64) {
let (from, to) = get_context().get_test_files_equal(bytes);
let args_str = format!("{from} {to}");
bencher
// .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
.with_inputs(|| args_str.clone())
.bench_refs(|cmd_args| binary::bench_binary("diff", cmd_args));
}
// bench the compiled release version
#[divan::bench(args = FILE_SIZE_KILO_BYTES)]
fn cmd_diff_release_equal(bencher: Bencher, bytes: u64) {
let (from, to) = get_context().get_test_files_equal(bytes);
let args_str = format!("diff {from} {to}");
bencher
// .with_inputs(|| prepare::cmp_params_identical_testfiles(lines))
.with_inputs(|| args_str.clone())
.bench_refs(|cmd_args| binary::bench_binary("target/release/diffutils", cmd_args));
}
}
mod parser {
use std::hint::black_box;
use diffutilslib::{cmp, params};
use divan::Bencher;
use crate::prepare::str_to_options;
// bench the time it takes to parse the command line arguments
#[divan::bench]
fn cmp_parser(bencher: Bencher) {
let cmd = "cmd file_1.txt file_2.txt -bl n10M --ignore-initial=100KiB:1MiB";
let args = str_to_options(&cmd).into_iter().peekable();
bencher
.with_inputs(|| args.clone())
.bench_values(|data| black_box(cmp::parse_params(data)));
}
// // test the impact on the benchmark if not converting the cmd to Vec<OsString> (doubles for parse)
// #[divan::bench]
// fn cmp_parser_no_prepare() {
// let cmd = "cmd file_1.txt file_2.txt -bl n10M --ignore-initial=100KiB:1MiB";
// let args = str_to_options(&cmd).into_iter().peekable();
// let _ = cmp::parse_params(args);
// }
// bench the time it takes to parse the command line arguments
#[divan::bench]
fn diff_parser(bencher: Bencher) {
let cmd = "diff file_1.txt file_2.txt -s --brief --expand-tabs --width=100";
let args = str_to_options(&cmd).into_iter().peekable();
bencher
.with_inputs(|| args.clone())
.bench_values(|data| black_box(params::parse_params(data)));
}
}
mod prepare {
use std::{
ffi::OsString,
fs::{self, File},
io::{BufWriter, Write},
path::Path,
sync::OnceLock,
};
use rand::RngExt;
use tempfile::TempDir;
use crate::{CHANGE_CHAR, FILE_SIZE_KILO_BYTES, NUM_DIFF, TEMP_DIR};
// file lines and .txt will be added
const FROM_FILE: &str = "from_file";
const TO_FILE: &str = "to_file";
const LINE_LENGTH: usize = 60;
/// Contains test data (file names) which only needs to be created once.
#[derive(Debug, Default)]
pub struct BenchContext {
pub tmp_dir: Option<TempDir>,
pub dir: String,
pub files_equal: Vec<(String, String)>,
pub files_different: Vec<(String, String)>,
}
impl BenchContext {
pub fn get_path(&self) -> &Path {
match &self.tmp_dir {
Some(tmp) => tmp.path(),
None => Path::new(&self.dir),
}
}
pub fn get_test_files_equal(&self, kb: u64) -> &(String, String) {
let p = FILE_SIZE_KILO_BYTES.iter().position(|f| *f == kb).unwrap();
&self.files_equal[p]
}
#[allow(unused)]
pub fn get_test_files_different(&self, kb: u64) -> &(String, String) {
let p = FILE_SIZE_KILO_BYTES.iter().position(|f| *f == kb).unwrap();
&self.files_different[p]
}
}
// Since each bench function is separate in Divan it is more difficult to dynamically create test data.
// This keeps the TempDir alive until the program exits and generates the files only once.
static SHARED_CONTEXT: OnceLock<BenchContext> = OnceLock::new();
/// Creates the test files once and provides them to all tests.
pub fn get_context() -> &'static BenchContext {
SHARED_CONTEXT.get_or_init(|| {
let mut ctx = BenchContext::default();
if TEMP_DIR.is_empty() {
let tmp_dir = TempDir::new().expect("Failed to create temp dir");
ctx.tmp_dir = Some(tmp_dir);
} else {
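// uses the directory given in TEMP_DIR; the generated files are kept
let path = Path::new(TEMP_DIR);
if !path.exists() {
// expect() does not interpolate its message, so build the panic text explicitly
fs::create_dir_all(path).unwrap_or_else(|e| panic!("Path {TEMP_DIR} could not be created: {e}"));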
}
ctx.dir = TEMP_DIR.to_string();
};
// generate test bytes
for kb in FILE_SIZE_KILO_BYTES {
let f = generate_test_files_bytes(ctx.get_path(), kb * 1000, 0, "eq")
.expect("generate_test_files failed");
ctx.files_equal.push(f);
let f = generate_test_files_bytes(ctx.get_path(), kb * 1000, NUM_DIFF, "df")
.expect("generate_test_files failed");
ctx.files_different.push(f);
}
ctx
})
}
pub fn str_to_options(opt: &str) -> Vec<OsString> {
opt.split(' ')
.filter(|s| !s.is_empty())
.map(OsString::from)
.collect()
}
/// Generates two test files for comparison with <bytes> size.
///
/// Each line consists of 10 words with 5 letters, giving a line length of 60 bytes.
/// If num_differences is set, '#' will be inserted between the first two words of a line,
/// evenly spaced in the file. 1 will add the change in the last line, so the comparison takes longest.
fn generate_test_files_bytes(
dir: &Path,
bytes: u64,
num_differences: u64,
id: &str,
) -> std::io::Result<(String, String)> {
let id = if id.is_empty() {
"".to_string()
} else {
format!("{id}_")
};
let f1 = format!("{id}{FROM_FILE}_{bytes}.txt");
let f2 = format!("{id}{TO_FILE}_{bytes}.txt");
let from_path = dir.join(f1);
let to_path = dir.join(f2);
generate_file_bytes(&from_path, &to_path, bytes, num_differences)?;
Ok((
from_path.to_string_lossy().to_string(),
to_path.to_string_lossy().to_string(),
))
}
fn generate_file_bytes(
from_name: &Path,
to_name: &Path,
bytes: u64,
num_differences: u64,
) -> std::io::Result<()> {
let file_from = File::create(from_name)?;
let file_to = File::create(to_name)?;
// integer division rounds down, so the full lines may total fewer bytes than requested;
// the remainder is written as a shorter last line below
let n_lines = bytes / LINE_LENGTH as u64;
let change_every_n_lines = if num_differences == 0 {
0
} else {
let c = n_lines / num_differences;
if c == 0 {
1
} else {
c
}
};
// Use a larger 128KB buffer for massive files
let mut writer_from = BufWriter::with_capacity(128 * 1024, file_from);
let mut writer_to = BufWriter::with_capacity(128 * 1024, file_to);
let mut rng = rand::rng();
// Each line: (5 chars * 10 words) + 9 spaces + 1 newline = 60 bytes
let mut line_buffer = [b' '; 60];
line_buffer[59] = b'\n'; // Set the newline once at the end
for i in (0..n_lines).rev() {
// Fill only the letter positions, skipping spaces and the newline
for word_idx in 0..10 {
let start = word_idx * 6; // Each word + space block is 6 bytes
// use a distinct name so the outer line counter `i` is not shadowed
for letter_idx in 0..5 {
line_buffer[start + letter_idx] = rng.random_range(b'a'..b'z' + 1);
}
}
// Write the raw bytes directly to both files
writer_from.write_all(&line_buffer)?;
// make changes in the file
if num_differences == 0 {
writer_to.write_all(&line_buffer)?;
} else {
if i % change_every_n_lines == 0 && n_lines - i > 2 {
line_buffer[5] = CHANGE_CHAR;
}
writer_to.write_all(&line_buffer)?;
line_buffer[5] = b' ';
}
}
// create last line
let missing = (bytes - n_lines as u64 * LINE_LENGTH as u64) as usize;
if missing > 0 {
for word_idx in 0..10 {
let start = word_idx * 6; // Each word + space block is 6 bytes
for i in 0..5 {
line_buffer[start + i] = rng.random_range(b'a'..b'z' + 1);
}
}
line_buffer[missing - 1] = b'\n';
writer_from.write_all(&line_buffer[0..missing])?;
writer_to.write_all(&line_buffer[0..missing])?;
}
writer_from.flush()?;
writer_to.flush()?;
Ok(())
}
}
mod binary {
use std::process::Command;
use crate::prepare::str_to_options;
pub fn bench_binary(program: &str, cmd_args: &str) -> std::process::ExitStatus {
let args = str_to_options(cmd_args);
Command::new(program)
.args(args)
.status()
.expect("Failed to execute binary")
}
}
fn main() {
// Run registered benchmarks.
divan::main();
}
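The line layout used by the benchmark file generator above can be sketched in isolation. This is a minimal, deterministic sketch under stated assumptions, not the benchmark's own code: `make_line`, `mark_changed`, and `change_interval` are hypothetical helper names, and the letters are derived from a seed instead of the `rand` crate so that the result is reproducible.

```rust
// Hypothetical constants mirroring the layout described in the generator's doc comment.
const WORDS_PER_LINE: usize = 10;
const WORD_LEN: usize = 5;
// Each word is followed by one separator byte (a space, or the final newline).
const LINE_LENGTH: usize = WORDS_PER_LINE * (WORD_LEN + 1); // 60 bytes

/// Builds one fixed-width line: ten 5-letter words separated by single
/// spaces and terminated by a newline. `seed` selects the letters
/// deterministically (an assumption; the benchmark uses random letters).
fn make_line(seed: usize) -> [u8; LINE_LENGTH] {
    let mut line = [b' '; LINE_LENGTH];
    for word in 0..WORDS_PER_LINE {
        let start = word * (WORD_LEN + 1);
        for letter in 0..WORD_LEN {
            line[start + letter] = b'a' + ((seed + word + letter) % 26) as u8;
        }
    }
    line[LINE_LENGTH - 1] = b'\n';
    line
}

/// Returns a copy of `line` with the separator after the first word
/// replaced by '#', which is how a "different" line is marked.
fn mark_changed(mut line: [u8; LINE_LENGTH]) -> [u8; LINE_LENGTH] {
    line[WORD_LEN] = b'#';
    line
}

/// Number of lines between two changed lines, or `None` when no
/// differences are requested. The interval is never smaller than 1.
fn change_interval(n_lines: u64, num_differences: u64) -> Option<u64> {
    if num_differences == 0 {
        None
    } else {
        Some((n_lines / num_differences).max(1))
    }
}
```

With these helpers, `mark_changed(make_line(0))` starts with the bytes `abcde#`, and `change_interval(1666, 4)` gives `Some(416)`, which roughly corresponds to four evenly spaced changes in a 100 KB file.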
+48 -48
@@ -34,15 +34,15 @@ checksum = "c08606f8c3cbf4ce6ec8e28fb0014a2c086708fe954eaa885384a6165172e7e8"
[[package]]
name = "bumpalo"
version = "3.19.1"
version = "3.20.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5dd9dc738b7a8311c7ade152424974d8115f2cdad61e8dab8dac9f2362298510"
checksum = "5d20789868f4b01b2f2caec9f5c4e0213b41e3e5702a50157d699ae31ced2fcb"
[[package]]
name = "cc"
version = "1.2.51"
version = "1.2.60"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a0aeaff4ff1a90589618835a598e545176939b97874f7abc7851caa0618f203"
checksum = "43c5703da9466b66a946814e1adf53ea2c90f10063b86290cc9eb67ce3478a20"
dependencies = [
"find-msvc-tools",
"jobserver",
@@ -58,9 +58,9 @@ checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
[[package]]
name = "chrono"
version = "0.4.42"
version = "0.4.44"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "145052bdd345b87320e369255277e3fb5152762ad123a901ef5c262dd38fe8d2"
checksum = "c673075a2e0e5f4a1dde27ce9dee1ea4558c7ffe648f576438a20ca1d2acc4b0"
dependencies = [
"iana-time-zone",
"js-sys",
@@ -95,9 +95,9 @@ dependencies = [
[[package]]
name = "find-msvc-tools"
version = "0.1.6"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "645cbb3a84e60b7531617d5ae4e57f7e27308f6445f5abf653209ea76dec8dff"
checksum = "5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582"
[[package]]
name = "getrandom"
@@ -113,9 +113,9 @@ dependencies = [
[[package]]
name = "iana-time-zone"
version = "0.1.64"
version = "0.1.65"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33e57f83510bb73707521ebaffa789ec8caf86f9657cad665b092b581d40e9fb"
checksum = "e31bc9ad994ba00e440a8aa5c9ef0ec67d5cb5e5cb0cc7f8b744a35b389cc470"
dependencies = [
"android_system_properties",
"core-foundation-sys",
@@ -137,9 +137,9 @@ dependencies = [
[[package]]
name = "itoa"
version = "1.0.17"
version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "92ecc6618181def0457392ccd0ee51198e065e016d1d527a7ac1b6dc7c1f09d2"
checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682"
[[package]]
name = "jobserver"
@@ -153,9 +153,9 @@ dependencies = [
[[package]]
name = "js-sys"
version = "0.3.83"
version = "0.3.94"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "464a3709c7f55f1f721e5389aa6ea4e3bc6aba669353300af094b29ffbdde1d8"
checksum = "2e04e2ef80ce82e13552136fabeef8a5ed1f985a96805761cbb9a2c34e7664d9"
dependencies = [
"once_cell",
"wasm-bindgen",
@@ -163,15 +163,15 @@ dependencies = [
[[package]]
name = "libc"
version = "0.2.178"
version = "0.2.184"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37c93d8daa9d8a012fd8ab92f088405fb202ea0b6ab73ee2482ae66af4f42091"
checksum = "48f5d2a454e16a5ea0f4ced81bd44e4cfc7bd3a507b61887c99fd3538b28e4af"
[[package]]
name = "libfuzzer-sys"
version = "0.4.10"
version = "0.4.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5037190e1f70cbeef565bd267599242926f724d3b8a9f510fd7e0b540cfa4404"
checksum = "a9fd2f41a1cba099f79a0b6b6c35656cf7c03351a7bae8ff0f28f25270f929d2"
dependencies = [
"arbitrary",
"cc",
@@ -185,9 +185,9 @@ checksum = "5e5032e24019045c762d3c0f28f5b6b8bbf38563a65908389bf7978758920897"
[[package]]
name = "memchr"
version = "2.7.6"
version = "2.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f52b00d39961fc5b2736ea853c9cc86238e165017a493d1d5c8eac6bdc4cc273"
checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79"
[[package]]
name = "num-traits"
@@ -200,24 +200,24 @@ dependencies = [
[[package]]
name = "once_cell"
version = "1.21.3"
version = "1.21.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "42f5e15c9953c5e4ccceeb2e7382a716482c34515315f7b03532b8b4e8393d2d"
checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50"
[[package]]
name = "proc-macro2"
version = "1.0.104"
version = "1.0.106"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9695f8df41bb4f3d222c95a67532365f569318332d03d5f3f67f37b20e6ebdf0"
checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934"
dependencies = [
"unicode-ident",
]
[[package]]
name = "quote"
version = "1.0.42"
version = "1.0.45"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a338cc41d27e6cc6dce6cefc13a0729dfbb81c262b1f519331575dd80ef3067f"
checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924"
dependencies = [
"proc-macro2",
]
@@ -230,9 +230,9 @@ checksum = "69cdb34c158ceb288df11e18b4bd39de994f6657d83847bdffdbd7f346754b0f"
[[package]]
name = "regex"
version = "1.12.2"
version = "1.12.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "843bc0191f75f3e22651ae5f1e72939ab2f72a4bc30fa80a066bd66edefc24d4"
checksum = "e10754a14b9137dd7b1e3e5b0493cc9171fdd105e0ab477f51b72e7f3ac0e276"
dependencies = [
"aho-corasick",
"memchr",
@@ -242,9 +242,9 @@ dependencies = [
[[package]]
name = "regex-automata"
version = "0.4.13"
version = "0.4.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5276caf25ac86c8d810222b3dbb938e512c55c6831a10f3e6ed1c93b84041f1c"
checksum = "6e1dd4122fc1595e8162618945476892eefca7b88c52820e74af6262213cae8f"
dependencies = [
"aho-corasick",
"memchr",
@@ -253,9 +253,9 @@ dependencies = [
[[package]]
name = "regex-syntax"
version = "0.8.8"
version = "0.8.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a2d987857b319362043e95f5353c0535c1f58eec5336fdfcf626430af7def58"
checksum = "dc897dd8d9e8bd1ed8cdad82b5966c3e0ecae09fb1907d58efaa013543185d0a"
[[package]]
name = "rustversion"
@@ -280,9 +280,9 @@ checksum = "0fda2ff0d084019ba4d7c6f371c95d8fd75ce3524c3cb8fb653a3023f6323e64"
[[package]]
name = "syn"
version = "2.0.112"
version = "2.0.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "21f182278bf2d2bcb3c88b1b08a37df029d71ce3d3ae26168e3c653b213b99d4"
checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99"
dependencies = [
"proc-macro2",
"quote",
@@ -291,9 +291,9 @@ dependencies = [
[[package]]
name = "unicode-ident"
version = "1.0.22"
version = "1.0.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9312f7c4f6ff9069b165498234ce8be658059c6728633667c526e27dc2cf1df5"
checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75"
[[package]]
name = "unicode-width"
@@ -311,18 +311,18 @@ dependencies = [
[[package]]
name = "wasip2"
version = "1.0.1+wasi-0.2.4"
version = "1.0.2+wasi-0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0562428422c63773dad2c345a1882263bbf4d65cf3f42e90921f787ef5ad58e7"
checksum = "9517f9239f02c069db75e65f174b3da828fe5f5b945c4dd26bd25d89c03ebcf5"
dependencies = [
"wit-bindgen",
]
[[package]]
name = "wasm-bindgen"
version = "0.2.106"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0d759f433fa64a2d763d1340820e46e111a7a5ab75f993d1852d70b03dbb80fd"
checksum = "0551fc1bb415591e3372d0bc4780db7e587d84e2a7e79da121051c5c4b89d0b0"
dependencies = [
"cfg-if",
"once_cell",
@@ -333,9 +333,9 @@ dependencies = [
[[package]]
name = "wasm-bindgen-macro"
version = "0.2.106"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "48cb0d2638f8baedbc542ed444afc0644a29166f1595371af4fecf8ce1e7eeb3"
checksum = "7fbdf9a35adf44786aecd5ff89b4563a90325f9da0923236f6104e603c7e86be"
dependencies = [
"quote",
"wasm-bindgen-macro-support",
@@ -343,9 +343,9 @@ dependencies = [
[[package]]
name = "wasm-bindgen-macro-support"
version = "0.2.106"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cefb59d5cd5f92d9dcf80e4683949f15ca4b511f4ac0a6e14d4e1ac60c6ecd40"
checksum = "dca9693ef2bab6d4e6707234500350d8dad079eb508dca05530c85dc3a529ff2"
dependencies = [
"bumpalo",
"proc-macro2",
@@ -356,9 +356,9 @@ dependencies = [
[[package]]
name = "wasm-bindgen-shared"
version = "0.2.106"
version = "0.2.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cbc538057e648b67f72a982e708d485b2efa771e1ac05fec311f9f63e5800db4"
checksum = "39129a682a6d2d841b6c429d0c51e5cb0ed1a03829d8b3d1e69a011e62cb3d3b"
dependencies = [
"unicode-ident",
]
@@ -442,6 +442,6 @@ dependencies = [
[[package]]
name = "wit-bindgen"
version = "0.46.0"
version = "0.51.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f17a85883d4e6d00e8a97c586de764dabcc06133f7f1d55dce5cdc070ad7fe59"
checksum = "d7249219f66ced02969388cf2bb044a09756a083d0fab1e566056b04d9fbcaa5"
+1 -1
@@ -51,4 +51,4 @@ doc = false
name = "fuzz_side"
path = "fuzz_targets/fuzz_side.rs"
test = false
doc = false
doc = false
+2 -2
@@ -4,7 +4,7 @@ extern crate libfuzzer_sys;
use diffutilslib::cmp::{self, Cmp};
use std::ffi::OsString;
use std::fs::File;
use std::fs::{self, File};
use std::io::Write;
fn os(s: &str) -> OsString {
@@ -18,7 +18,7 @@ fuzz_target!(|x: (Vec<u8>, Vec<u8>)| {
.peekable();
let (from, to) = x;
fs::create_dir_all("target").unwrap();
File::create("target/fuzz.cmp.a")
.unwrap()
.write_all(&from)
+3
@@ -11,6 +11,9 @@ fn os(s: &str) -> OsString {
}
fuzz_target!(|x: Vec<OsString>| -> Corpus {
if x.iter().any(|a| a == "--help") {
return Corpus::Reject;
}
if x.len() > 6 {
// Make sure we try to parse an option when we get longer args. x[0] will be
// the executable name.
+1
@@ -38,6 +38,7 @@ fuzz_target!(|x: (Vec<u8>, Vec<u8>)| {
} else {
return;
}
fs::create_dir_all("target").unwrap();
let diff = diff_w(&from, &to, "target/fuzz.file").unwrap();
File::create("target/fuzz.file.original")
.unwrap()
+1
@@ -23,6 +23,7 @@ fuzz_target!(|x: (Vec<u8>, Vec<u8>)| {
return
}*/
let diff = normal_diff::diff(&from, &to, &Params::default());
fs::create_dir_all("target").unwrap();
File::create("target/fuzz.file.original")
.unwrap()
.write_all(&from)
+5 -3
@@ -21,15 +21,17 @@ fuzz_target!(|x: (Vec<u8>, Vec<u8>, u8)| {
} else {
return
}*/
fs::create_dir_all("target").unwrap();
let patched = "target/fuzz.file";
let diff = unified_diff::diff(
&from,
&to,
&Params {
from: "a/fuzz.file".into(),
to: "target/fuzz.file".into(),
from: patched.into(),
to: patched.into(),
context_count: context as usize,
..Default::default()
}
},
);
File::create("target/fuzz.file.original")
.unwrap()
+4 -3
@@ -4,9 +4,9 @@ extern crate libfuzzer_sys;
use diffutilslib::side_diff;
use std::fs::File;
use std::io::Write;
use diffutilslib::params::Params;
use std::fs::{self, File};
use std::io::Write;
fuzz_target!(|x: (Vec<u8>, Vec<u8>, /* usize, usize */ bool)| {
let (original, new, /* width, tabsize, */ expand) = x;
@@ -21,6 +21,7 @@ fuzz_target!(|x: (Vec<u8>, Vec<u8>, /* usize, usize */ bool)| {
expand_tabs: expand,
..Default::default()
};
fs::create_dir_all("target").unwrap();
let mut output_buf = vec![];
side_diff::diff(&original, &new, &mut output_buf, &params);
File::create("target/fuzz.file.original")
@@ -39,4 +40,4 @@ fuzz_target!(|x: (Vec<u8>, Vec<u8>, /* usize, usize */ bool)| {
.unwrap()
.write_all(&output_buf)
.unwrap();
});
});
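The fuzz targets above now call `fs::create_dir_all("target")` before writing their scratch files. A minimal sketch of that pattern follows; `write_artifact` is a hypothetical helper that does not exist in the repository, and it returns errors with `?` where the fuzz targets call `unwrap()`.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `bytes` to `dir/name`, creating `dir` (and any missing parents) first.
/// Hypothetical helper for illustration only.
fn write_artifact(dir: &Path, name: &str, bytes: &[u8]) -> io::Result<PathBuf> {
    // create_dir_all succeeds when the directory already exists,
    // so calling it on every fuzz iteration is safe.
    fs::create_dir_all(dir)?;
    let path = dir.join(name);
    // File::create truncates an existing file, so repeated runs overwrite it.
    File::create(&path)?.write_all(bytes)?;
    Ok(path)
}
```

Without the `create_dir_all` call, `File::create` returns an error when the directory is missing, which is presumably the failure these commits guard against.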
+147 -129
@@ -11,24 +11,29 @@ use std::iter::Peekable;
use std::process::ExitCode;
use std::{cmp, fs, io};
#[cfg(not(target_os = "windows"))]
#[cfg(unix)]
use std::os::fd::{AsRawFd, FromRawFd};
#[cfg(not(target_os = "windows"))]
#[cfg(unix)]
use std::os::unix::fs::MetadataExt;
#[cfg(target_os = "windows")]
use std::os::windows::fs::MetadataExt;
/// for --bytes, so really large number limits can be expressed, like 1Y.
pub type BytesLimitU64 = u64;
// ignore initial is currently limited to u64, as take(skip) is used.
pub type SkipU64 = u64;
#[derive(Clone, Debug, Default, Eq, PartialEq)]
pub struct Params {
executable: OsString,
from: OsString,
to: OsString,
print_bytes: bool,
skip_a: Option<usize>,
skip_b: Option<usize>,
max_bytes: Option<usize>,
skip_a: Option<SkipU64>,
skip_b: Option<SkipU64>,
max_bytes: Option<BytesLimitU64>,
verbose: bool,
quiet: bool,
}
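The move from `usize` to the `u64` aliases above matters on 32-bit targets, where a multiplier such as 1 TiB does not fit into `usize`. The sketch below illustrates the idea under stated assumptions: `multiplier` and `scale` are hypothetical names, only a few suffixes are listed, and saturating on overflow is an assumption of this sketch rather than a confirmed behavior of `cmp`.

```rust
/// Maps a size suffix to its multiplier. Only a subset of suffixes is
/// shown; unknown suffixes yield `None`. Hypothetical helper.
fn multiplier(suffix: &str) -> Option<u64> {
    Some(match suffix {
        "kB" => 1_000,
        "K" => 1_024,
        "MB" => 1_000_000,
        "M" => 1_048_576,
        "GB" => 1_000_000_000,
        "G" => 1_073_741_824,
        // These two exceed u32::MAX, so they need a 64-bit type on every target.
        "TB" => 1_000_000_000_000,
        "T" => 1_099_511_627_776,
        _ => return None,
    })
}

/// Scales `num` by the suffix multiplier, saturating at `u64::MAX`
/// (the saturation is an assumption of this sketch).
fn scale(num: u64, suffix: &str) -> Option<u64> {
    multiplier(suffix).map(|m| num.saturating_mul(m))
}
```

Using a fixed-width `u64` keeps the suffix table identical on all targets, so no `cfg(target_pointer_width)` branches are needed.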
@@ -38,7 +43,7 @@ fn usage_string(executable: &str) -> String {
format!("Usage: {executable} <from> <to>")
}
#[cfg(not(target_os = "windows"))]
#[cfg(unix)]
fn is_stdout_dev_null() -> bool {
let Ok(dev_null) = fs::metadata("/dev/null") else {
return false;
@@ -60,19 +65,22 @@ fn is_stdout_dev_null() -> bool {
is_dev_null
}
#[cfg(not(any(unix, target_os = "windows")))]
fn is_stdout_dev_null() -> bool {
false
}
pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Result<Params, String> {
let Some(executable) = opts.next() else {
return Err("Usage: <exe> <from> <to>".to_string());
};
let executable = opts.next().ok_or("Usage: <exe> <from> <to>".to_string())?;
let executable_str = executable.to_string_lossy().to_string();
let parse_skip = |param: &str, skip_desc: &str| -> Result<usize, String> {
let parse_skip = |param: &str, skip_desc: &str| -> Result<SkipU64, String> {
let suffix_start = param
.find(|b: char| !b.is_ascii_digit())
.unwrap_or(param.len());
let mut num = match param[..suffix_start].parse::<usize>() {
let mut num = match param[..suffix_start].parse::<SkipU64>() {
Ok(num) => num,
Err(e) if *e.kind() == std::num::IntErrorKind::PosOverflow => usize::MAX,
Err(e) if *e.kind() == std::num::IntErrorKind::PosOverflow => SkipU64::MAX,
Err(_) => {
return Err(format!(
"{executable_str}: invalid --ignore-initial value '{skip_desc}'"
@@ -83,33 +91,24 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
if suffix_start != param.len() {
// Note that GNU cmp advertises supporting up to Y, but fails if you try
// to actually use anything beyond E.
let multiplier: usize = match &param[suffix_start..] {
let multiplier: SkipU64 = match &param[suffix_start..] {
"kB" => 1_000,
"K" => 1_024,
"MB" => 1_000_000,
"M" => 1_048_576,
"GB" => 1_000_000_000,
"G" => 1_073_741_824,
// This only generates a warning when compiling for target_pointer_width < 64
#[allow(unused_variables)]
suffix @ ("TB" | "T" | "PB" | "P" | "EB" | "E") => {
#[cfg(target_pointer_width = "64")]
match suffix {
"TB" => 1_000_000_000_000,
"T" => 1_099_511_627_776,
"PB" => 1_000_000_000_000_000,
"P" => 1_125_899_906_842_624,
"EB" => 1_000_000_000_000_000_000,
"E" => 1_152_921_504_606_846_976,
_ => unreachable!(),
}
#[cfg(not(target_pointer_width = "64"))]
usize::MAX
}
"ZB" => usize::MAX, // 1_000_000_000_000_000_000_000,
"Z" => usize::MAX, // 1_180_591_620_717_411_303_424,
"YB" => usize::MAX, // 1_000_000_000_000_000_000_000_000,
"Y" => usize::MAX, // 1_208_925_819_614_629_174_706_176,
"TB" => 1_000_000_000_000,
"T" => 1_099_511_627_776,
"PB" => 1_000_000_000_000_000,
"P" => 1_125_899_906_842_624,
"EB" => 1_000_000_000_000_000_000,
"E" => 1_152_921_504_606_846_976,
// TODO setting SkipU64::MAX does not mimic GNU cmp behavior; it should be an error.
"ZB" => SkipU64::MAX, // 1_000_000_000_000_000_000_000,
"Z" => SkipU64::MAX, // 1_180_591_620_717_411_303_424,
"YB" => SkipU64::MAX, // 1_000_000_000_000_000_000_000_000,
"Y" => SkipU64::MAX, // 1_208_925_819_614_629_174_706_176,
_ => {
return Err(format!(
"{executable_str}: invalid --ignore-initial value '{skip_desc}'"
@@ -119,7 +118,7 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
num = match num.overflowing_mul(multiplier) {
(n, false) => n,
_ => usize::MAX,
_ => SkipU64::MAX,
}
}
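The suffix handling in the hunks above can be sketched in isolation. This is a minimal standalone illustration, not the crate's code: `parse_skip_sketch` is a hypothetical name, it saturates at `u64::MAX` on overflow as the diff does, and (an assumption of this sketch only) it rejects the `Z`/`Y` suffixes instead of mapping them to the maximum.

```rust
use std::num::IntErrorKind;

/// Hypothetical helper: parse "<digits><suffix>" into a byte count held in a u64.
fn parse_skip_sketch(param: &str) -> Result<u64, String> {
    // Split at the first non-digit character; everything after it is the suffix.
    let suffix_start = param
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(param.len());

    let num = match param[..suffix_start].parse::<u64>() {
        Ok(n) => n,
        // Too many digits for u64: saturate instead of failing.
        Err(e) if *e.kind() == IntErrorKind::PosOverflow => u64::MAX,
        Err(_) => return Err(format!("invalid value '{param}'")),
    };

    if suffix_start == param.len() {
        return Ok(num);
    }

    let multiplier: u64 = match &param[suffix_start..] {
        "kB" => 1_000,
        "K" => 1_024,
        "MB" => 1_000_000,
        "M" => 1_048_576,
        "GB" => 1_000_000_000,
        "G" => 1_073_741_824,
        "TB" => 1_000_000_000_000,
        "T" => 1_099_511_627_776,
        "PB" => 1_000_000_000_000_000,
        "P" => 1_125_899_906_842_624,
        "EB" => 1_000_000_000_000_000_000,
        "E" => 1_152_921_504_606_846_976,
        // Assumption of this sketch: anything else, including Z and Y, is an error.
        _ => return Err(format!("invalid value '{param}'")),
    };

    // Saturating multiply gives the same result as overflowing_mul plus a MAX fallback.
    Ok(num.saturating_mul(multiplier))
}
```

Because every multiplier up to `E` fits in a `u64`, the table no longer needs the `target_pointer_width` split that the `usize` version required.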
@@ -173,9 +172,10 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
let (_, arg) = param_str.split_once('=').unwrap();
arg.to_string()
};
let max_bytes = match max_bytes.parse::<usize>() {
let max_bytes = match max_bytes.parse::<BytesLimitU64>() {
Ok(num) => num,
Err(e) if *e.kind() == std::num::IntErrorKind::PosOverflow => usize::MAX,
// TODO limit to MAX is dangerous, this should become an error like in GNU cmp.
Err(e) if *e.kind() == std::num::IntErrorKind::PosOverflow => BytesLimitU64::MAX,
Err(_) => {
return Err(format!(
"{executable_str}: invalid --bytes value '{max_bytes}'"
@@ -217,7 +217,7 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
std::process::exit(0);
}
if param_str.starts_with('-') {
return Err(format!("Unknown option: {param:?}"));
return Err(format!("unrecognized option '{}'", param.to_string_lossy()));
}
if from.is_none() {
from = Some(param);
@@ -233,7 +233,7 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
}
// Do as GNU cmp, and completely disable printing if we are
// outputing to /dev/null.
// outputting to /dev/null.
#[cfg(not(target_os = "windows"))]
if is_stdout_dev_null() {
params.quiet = true;
@@ -285,32 +285,21 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
fn prepare_reader(
path: &OsString,
skip: &Option<usize>,
skip: &Option<SkipU64>,
params: &Params,
) -> Result<Box<dyn BufRead>, String> {
let mut reader: Box<dyn BufRead> = if path == "-" {
Box::new(BufReader::new(io::stdin()))
} else {
match fs::File::open(path) {
Ok(file) => Box::new(BufReader::new(file)),
Err(e) => {
return Err(format_failure_to_read_input_file(
&params.executable,
path,
&e,
));
}
}
let file = fs::File::open(path)
.map_err(|e| format_failure_to_read_input_file(&params.executable, path, &e))?;
Box::new(BufReader::new(file))
};
if let Some(skip) = skip {
if let Err(e) = io::copy(&mut reader.by_ref().take(*skip as u64), &mut io::sink()) {
return Err(format_failure_to_read_input_file(
&params.executable,
path,
&e,
));
}
// SkipU64 is currently u64, so no cast is needed here; add one back if the SkipU64 type ever changes.
io::copy(&mut reader.by_ref().take(*skip), &mut io::sink())
.map_err(|e| format_failure_to_read_input_file(&params.executable, path, &e))?;
}
Ok(reader)
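The skip step in `prepare_reader` discards the first bytes of a stream by copying a bounded view of the reader into a sink. A minimal sketch of just that step, with `skip_bytes_sketch` as a hypothetical name that does not exist in the crate:

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical helper: discard up to `skip` bytes from `reader`.
/// Returns how many bytes were actually discarded, which is less than
/// `skip` when the input ends early.
fn skip_bytes_sketch<R: BufRead>(reader: &mut R, skip: u64) -> io::Result<u64> {
    // `take` limits the borrowed reader to `skip` bytes; `sink` throws them away.
    io::copy(&mut reader.by_ref().take(skip), &mut io::sink())
}
```

Since `Read::take` accepts a `u64` limit, a `u64` skip count can be passed straight through, which is why the `as u64` cast disappears in the hunk above.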
@@ -326,11 +315,11 @@ pub fn cmp(params: &Params) -> Result<Cmp, String> {
let mut from = prepare_reader(&params.from, &params.skip_a, params)?;
let mut to = prepare_reader(&params.to, &params.skip_b, params)?;
let mut offset_width = params.max_bytes.unwrap_or(usize::MAX);
let mut offset_width = params.max_bytes.unwrap_or(BytesLimitU64::MAX);
if let (Ok(a_meta), Ok(b_meta)) = (fs::metadata(&params.from), fs::metadata(&params.to)) {
#[cfg(not(target_os = "windows"))]
let (a_size, b_size) = (a_meta.size(), b_meta.size());
let (a_size, b_size) = (a_meta.len(), b_meta.len());
#[cfg(target_os = "windows")]
let (a_size, b_size) = (a_meta.file_size(), b_meta.file_size());
@@ -341,7 +330,7 @@ pub fn cmp(params: &Params) -> Result<Cmp, String> {
return Ok(Cmp::Different);
}
let smaller = cmp::min(a_size, b_size) as usize;
let smaller = cmp::min(a_size, b_size) as BytesLimitU64;
offset_width = cmp::min(smaller, offset_width);
}
@@ -350,34 +339,20 @@ pub fn cmp(params: &Params) -> Result<Cmp, String> {
// Capacity calc: at_byte width + 2 x 3-byte octal numbers + 2 x 4-byte value + 4 spaces
let mut output = Vec::<u8>::with_capacity(offset_width + 3 * 2 + 4 * 2 + 4);
let mut at_byte = 1;
let mut at_line = 1;
let mut at_byte: BytesLimitU64 = 1;
let mut at_line: u64 = 1;
let mut start_of_line = true;
let mut stdout = BufWriter::new(io::stdout().lock());
let mut compare = Cmp::Equal;
loop {
// Fill up our buffers.
let from_buf = match from.fill_buf() {
Ok(buf) => buf,
Err(e) => {
return Err(format_failure_to_read_input_file(
&params.executable,
&params.from,
&e,
));
}
};
let from_buf = from
.fill_buf()
.map_err(|e| format_failure_to_read_input_file(&params.executable, &params.from, &e))?;
let to_buf = match to.fill_buf() {
Ok(buf) => buf,
Err(e) => {
return Err(format_failure_to_read_input_file(
&params.executable,
&params.to,
&e,
));
}
};
let to_buf = to
.fill_buf()
.map_err(|e| format_failure_to_read_input_file(&params.executable, &params.to, &e))?;
// Check for EOF conditions.
if from_buf.is_empty() && to_buf.is_empty() {
@@ -401,8 +376,8 @@ pub fn cmp(params: &Params) -> Result<Cmp, String> {
if from_buf[..consumed] == to_buf[..consumed] {
let last = from_buf[..consumed].last().unwrap();
at_byte += consumed;
at_line += from_buf[..consumed].iter().filter(|&c| *c == b'\n').count();
at_byte += consumed as BytesLimitU64;
at_line += (from_buf[..consumed].iter().filter(|&c| *c == b'\n').count()) as u64;
start_of_line = *last == b'\n';
@@ -500,12 +475,6 @@ pub fn main(opts: Peekable<ArgsOs>) -> ExitCode {
}
}
#[inline]
fn is_ascii_printable(byte: u8) -> bool {
let c = byte as char;
c.is_ascii() && !c.is_ascii_control()
}
#[inline]
fn format_octal(byte: u8, buf: &mut [u8; 3]) -> &str {
*buf = [b' ', b' ', b'0'];
@@ -525,32 +494,68 @@ fn format_octal(byte: u8, buf: &mut [u8; 3]) -> &str {
}
#[inline]
fn format_byte(byte: u8) -> String {
let mut byte = byte;
let mut quoted = vec![];
if !is_ascii_printable(byte) {
if byte >= 128 {
quoted.push(b'M');
quoted.push(b'-');
byte -= 128;
fn write_visible_byte(output: &mut Vec<u8>, byte: u8) -> usize {
match byte {
// Control characters: ^@, ^A, ..., ^_
0..=31 => {
output.push(b'^');
output.push(byte + 64);
2
}
if byte < 32 {
quoted.push(b'^');
byte += 64;
} else if byte == 127 {
quoted.push(b'^');
byte = b'?';
// Printable ASCII (space through ~)
32..=126 => {
output.push(byte);
1
}
// DEL: ^?
127 => {
output.extend_from_slice(b"^?");
2
}
// High bytes with control equivalents: M-^@, M-^A, ..., M-^_
128..=159 => {
output.push(b'M');
output.push(b'-');
output.push(b'^');
output.push(byte - 64);
4
}
// High bytes: M-<space>, M-!, ..., M-~
160..=254 => {
output.push(b'M');
output.push(b'-');
output.push(byte - 128);
3
}
// Byte 255: M-^?
255 => {
output.extend_from_slice(b"M-^?");
4
}
assert!((byte as char).is_ascii());
}
}
quoted.push(byte);
/// Writes a byte in visible form with right-padding to 4 spaces.
#[inline]
fn write_visible_byte_padded(output: &mut Vec<u8>, byte: u8) {
const SPACES: &[u8] = b"    ";
const WIDTH: usize = SPACES.len();
// SAFETY: the checks and shifts we do above match what cat and GNU
let display_width = write_visible_byte(output, byte);
// Add right-padding spaces
let padding = WIDTH.saturating_sub(display_width);
output.extend_from_slice(&SPACES[..padding]);
}
/// Formats a byte as a visible string (for non-performance-critical path)
#[inline]
fn format_visible_byte(byte: u8) -> String {
let mut result = Vec::with_capacity(4);
write_visible_byte(&mut result, byte);
// SAFETY: the checks and shifts in write_visible_byte match what cat and GNU
// cmp do to ensure characters fall inside the ascii range.
unsafe { String::from_utf8_unchecked(quoted) }
unsafe { String::from_utf8_unchecked(result) }
}
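The mapping implemented by `write_visible_byte` can be restated compactly for reference. The following is a sketch only, using a hypothetical `visible_byte_sketch` that returns a `String` and folds the high-bit case into a prefix; the diff's version writes into a `Vec<u8>` and spells out each range.

```rust
/// Hypothetical restatement of the visible-byte notation:
/// control bytes become ^X, DEL becomes ^?, and bytes >= 128 get an "M-" prefix.
fn visible_byte_sketch(byte: u8) -> String {
    let mut out = String::with_capacity(4);

    // High bytes are shown as "M-" followed by the notation of (byte - 128).
    let low = if byte >= 128 {
        out.push_str("M-");
        byte - 128
    } else {
        byte
    };

    match low {
        // Control characters: ^@, ^A, ..., ^_
        0..=31 => {
            out.push('^');
            out.push((low + 64) as char);
        }
        // DEL
        127 => out.push_str("^?"),
        // Printable ASCII, space through ~
        _ => out.push(low as char),
    }
    out
}
```

Every byte pushed is ASCII, so the result is always valid UTF-8 and at most four characters long.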
// This function has been optimized to not use the Rust fmt system, which
@@ -560,7 +565,7 @@ fn format_byte(byte: u8) -> String {
fn format_verbose_difference(
from_byte: u8,
to_byte: u8,
at_byte: usize,
at_byte: BytesLimitU64,
offset_width: usize,
output: &mut Vec<u8>,
params: &Params,
@@ -588,14 +593,7 @@ fn format_verbose_difference(
output.push(b' ');
let from_byte_str = format_byte(from_byte);
let from_byte_padding = 4 - from_byte_str.len();
output.extend_from_slice(from_byte_str.as_bytes());
for _ in 0..from_byte_padding {
output.push(b' ')
}
write_visible_byte_padded(output, from_byte);
output.push(b' ');
@@ -603,7 +601,7 @@ fn format_verbose_difference(
output.push(b' ');
output.extend_from_slice(format_byte(to_byte).as_bytes());
write_visible_byte(output, to_byte);
output.push(b'\n');
} else {
@@ -632,7 +630,13 @@ fn format_verbose_difference(
}
#[inline]
fn report_eof(at_byte: usize, at_line: usize, start_of_line: bool, eof_on: &str, params: &Params) {
fn report_eof(
at_byte: BytesLimitU64,
at_line: u64,
start_of_line: bool,
eof_on: &str,
params: &Params,
) {
if params.quiet {
return;
}
@@ -684,7 +688,13 @@ fn is_posix_locale() -> bool {
}
#[inline]
fn report_difference(from_byte: u8, to_byte: u8, at_byte: usize, at_line: usize, params: &Params) {
fn report_difference(
from_byte: u8,
to_byte: u8,
at_byte: BytesLimitU64,
at_line: u64,
params: &Params,
) {
if params.quiet {
return;
}
@@ -706,9 +716,9 @@ fn report_difference(from_byte: u8, to_byte: u8, at_byte: usize, at_line: usize,
print!(
" is {:>3o} {:char_width$} {:>3o} {:char_width$}",
from_byte,
format_byte(from_byte),
format_visible_byte(from_byte),
to_byte,
format_byte(to_byte)
format_visible_byte(to_byte)
);
}
println!();
@@ -781,7 +791,7 @@ mod tests {
from: os("foo"),
to: os("bar"),
skip_a: Some(1),
skip_b: Some(usize::MAX),
skip_b: Some(SkipU64::MAX),
..Default::default()
}),
parse_params(
@@ -959,7 +969,7 @@ mod tests {
executable: os("cmp"),
from: os("foo"),
to: os("bar"),
max_bytes: Some(usize::MAX),
max_bytes: Some(BytesLimitU64::MAX),
..Default::default()
}),
parse_params(
@@ -976,6 +986,7 @@ mod tests {
);
// Failure case
// TODO This is actually fine in GNU cmp. --bytes does not have a unit parser yet.
assert_eq!(
Err("cmp: invalid --bytes value '1K'".to_string()),
parse_params(
@@ -1021,8 +1032,8 @@ mod tests {
executable: os("cmp"),
from: os("foo"),
to: os("bar"),
skip_a: Some(usize::MAX),
skip_b: Some(usize::MAX),
skip_a: Some(SkipU64::MAX),
skip_b: Some(SkipU64::MAX),
..Default::default()
}),
parse_params(
@@ -1062,6 +1073,9 @@ mod tests {
from: os("foo"),
to: os("bar"),
skip_a: Some(1_000_000_000),
#[cfg(target_pointer_width = "32")]
skip_b: Some((2_147_483_647.5 * 2.0) as usize),
#[cfg(target_pointer_width = "64")]
skip_b: Some(1_152_921_504_606_846_976 * 2),
..Default::default()
}),
@@ -1093,8 +1107,12 @@ mod tests {
.enumerate()
{
let values = [
1_000usize.checked_pow((i + 1) as u32).unwrap_or(usize::MAX),
1024usize.checked_pow((i + 1) as u32).unwrap_or(usize::MAX),
(1_000 as SkipU64)
.checked_pow((i + 1) as u32)
.unwrap_or(SkipU64::MAX),
(1024 as SkipU64)
.checked_pow((i + 1) as u32)
.unwrap_or(SkipU64::MAX),
];
for (j, v) in values.iter().enumerate() {
assert_eq!(
+23 -16
@@ -381,6 +381,9 @@ pub fn diff(expected: &[u8], actual: &[u8], params: &Params) -> Vec<u8> {
mod tests {
use super::*;
use pretty_assertions::assert_eq;
use crate::utils::testcmds::PATCH_CMD;
#[test]
fn test_permutations() {
// test all possible six-line files.
@@ -394,7 +397,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"b\n" })
@@ -429,12 +431,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alef");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alef".into(),
to: (&format!("{target}/alef")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -449,7 +452,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg("--context")
.stdin(File::open(format!("{target}/ab.diff")).unwrap())
@@ -481,7 +485,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"\n" } else { b"b\n" }).unwrap();
@@ -510,12 +513,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alef_");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alef_".into(),
to: (&format!("{target}/alef_")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -530,7 +534,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg("--context")
.stdin(File::open(format!("{target}/ab_.diff")).unwrap())
@@ -562,7 +567,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"" }).unwrap();
@@ -594,12 +598,13 @@ mod tests {
};
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alefx");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alefx".into(),
to: (&format!("{target}/alefx")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -614,7 +619,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg("--context")
.stdin(File::open(format!("{target}/abx.diff")).unwrap())
@@ -646,7 +652,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"f\n" })
@@ -681,12 +686,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let alefr_path = &format!("{target}/alefr");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alefr".into(),
to: (&format!("{target}/alefr")).into(),
from: alefr_path.into(),
to: alefr_path.into(),
context_count: 2,
..Default::default()
},
@@ -701,7 +707,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg("--context")
.stdin(File::open(format!("{target}/abr.diff")).unwrap())
+9 -6
@@ -162,6 +162,9 @@ pub fn diff(expected: &[u8], actual: &[u8], params: &Params) -> Result<Vec<u8>,
mod tests {
use super::*;
use pretty_assertions::assert_eq;
use crate::utils::testcmds::ED_CMD;
pub fn diff_w(expected: &[u8], actual: &[u8], filename: &str) -> Result<Vec<u8>, DiffError> {
let mut output = diff(expected, actual, &Params::default())?;
writeln!(&mut output, "w {filename}").unwrap();
@@ -237,8 +240,8 @@ mod tests {
let _ = fb;
#[cfg(not(windows))] // there's no ed on windows
{
use std::process::Command;
let output = Command::new("ed")
let output = ED_CMD
.new()
.arg(format!("{target}/alef"))
.stdin(File::open(format!("{target}/ab.ed")).unwrap())
.output()
@@ -311,8 +314,8 @@ mod tests {
let _ = fb;
#[cfg(not(windows))] // there's no ed on windows
{
use std::process::Command;
let output = Command::new("ed")
let output = ED_CMD
.new()
.arg(format!("{target}/alef_"))
.stdin(File::open(format!("{target}/ab_.ed")).unwrap())
.output()
@@ -391,8 +394,8 @@ mod tests {
let _ = fb;
#[cfg(not(windows))] // there's no ed on windows
{
use std::process::Command;
let output = Command::new("ed")
let output = ED_CMD
.new()
.arg(format!("{target}/alefr"))
.stdin(File::open(format!("{target}/abr.ed")).unwrap())
.output()
+11 -7
@@ -58,7 +58,7 @@ fn main() -> ExitCode {
let exe_path = binary_path(&mut args);
let exe_name = name(&exe_path);
let util_name = if exe_name == "diffutils" {
let util_name = if exe_name.as_encoded_bytes().ends_with(b"diffutils") {
// Discard the item we peeked.
let _ = args.next();
@@ -69,13 +69,17 @@ fn main() -> ExitCode {
OsString::from(exe_name)
};
match util_name.to_str() {
Some("diff") => diff::main(args),
Some("cmp") => cmp::main(args),
Some(name) => {
eprintln!("{name}: utility not supported");
match util_name.as_encoded_bytes() {
name if name.ends_with(b"diff") => diff::main(args),
name if name.ends_with(b"cmp") => cmp::main(args),
name => {
use std::io::{stderr, Write as _};
let _ = writeln!(
stderr(),
"{}: utility not supported",
String::from_utf8_lossy(name)
);
ExitCode::from(2)
}
None => second_arg_error(exe_name),
}
}
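The prefixed-name dispatch above matches on the encoded bytes of the executable name instead of requiring valid UTF-8. A minimal sketch of that selection logic, where `Util` and `pick_util_sketch` are hypothetical names introduced for illustration:

```rust
use std::ffi::OsStr;

#[derive(Debug, PartialEq)]
enum Util {
    Diff,
    Cmp,
}

/// Hypothetical helper: choose a utility from a possibly prefixed name
/// such as "uu-diff" or "uucmp" by looking only at the name's suffix.
fn pick_util_sketch(name: &OsStr) -> Option<Util> {
    let bytes = name.as_encoded_bytes();
    if bytes.ends_with(b"diff") {
        Some(Util::Diff)
    } else if bytes.ends_with(b"cmp") {
        Some(Util::Cmp)
    } else {
        None
    }
}
```

One consequence of suffix matching worth keeping in mind: any name that merely ends in `diff` or `cmp`, for example `sdiff`, selects the same utility.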
+10 -8
@@ -215,6 +215,8 @@ mod tests {
use super::*;
use pretty_assertions::assert_eq;
use crate::utils::testcmds::PATCH_CMD;
#[test]
fn test_basic() {
let mut a = Vec::new();
@@ -239,7 +241,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"b\n" })
@@ -285,7 +286,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg(format!("{target}/alef"))
.stdin(File::open(format!("{target}/ab.diff")).unwrap())
@@ -318,7 +320,6 @@ mod tests {
for &g in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"b\n" })
@@ -377,7 +378,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg("--normal")
.arg(format!("{target}/alefn"))
@@ -411,7 +413,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"\n" } else { b"b\n" }).unwrap();
@@ -451,7 +452,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg(format!("{target}/alef_"))
.stdin(File::open(format!("{target}/ab_.diff")).unwrap())
@@ -483,7 +485,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"f\n" })
@@ -529,7 +530,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.arg(format!("{target}/alefr"))
.stdin(File::open(format!("{target}/abr.diff")).unwrap())
+1 -1
@@ -195,7 +195,7 @@ pub fn parse_params<I: Iterator<Item = OsString>>(mut opts: Peekable<I>) -> Resu
Err(error) => return Err(error),
}
if param.to_string_lossy().starts_with('-') {
return Err(format!("Unknown option: {param:?}"));
return Err(format!("unrecognized option '{}'", param.to_string_lossy()));
}
if from.is_none() {
from = Some(param);
+29 -20
@@ -408,6 +408,8 @@ mod tests {
use super::*;
use pretty_assertions::assert_eq;
use crate::utils::testcmds::PATCH_CMD;
#[test]
fn test_permutations() {
let target = "target/unified-diff/";
@@ -421,7 +423,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"b\n" })
@@ -456,12 +457,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alef");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alef".into(),
to: (&format!("{target}/alef")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -492,7 +494,10 @@ mod tests {
.unwrap_or_else(|_| String::from("[Invalid UTF-8]"))
);
let output = Command::new("patch")
use crate::utils::testcmds::PATCH_CMD;
let output = PATCH_CMD
.new()
.arg("-p0")
.stdin(File::open(format!("{target}/ab.diff")).unwrap())
.output()
@@ -524,7 +529,6 @@ mod tests {
for &g in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"b\n" })
@@ -572,12 +576,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alefn");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alefn".into(),
to: (&format!("{target}/alefn")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -592,7 +597,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.stdin(File::open(format!("{target}/abn.diff")).unwrap())
.output()
@@ -625,7 +631,6 @@ mod tests {
for &g in &[0, 1, 2, 3] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"\n" } else { b"b\n" }).unwrap();
@@ -668,12 +673,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alef_");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alef_".into(),
to: (&format!("{target}/alef_")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -688,7 +694,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.stdin(File::open(format!("{target}/ab_.diff")).unwrap())
.output()
@@ -720,7 +727,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"" }).unwrap();
@@ -749,12 +755,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alefx");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alefx".into(),
to: (&format!("{target}/alefx")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -769,7 +776,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.stdin(File::open(format!("{target}/abx.diff")).unwrap())
.output()
@@ -800,7 +808,6 @@ mod tests {
for &f in &[0, 1, 2] {
use std::fs::{self, File};
use std::io::Write;
use std::process::Command;
let mut alef = Vec::new();
let mut bet = Vec::new();
alef.write_all(if a == 0 { b"a\n" } else { b"f\n" })
@@ -835,12 +842,13 @@ mod tests {
}
// This test diff is intentionally reversed.
// We want it to turn the alef into bet.
let patched = &format!("{target}/alefr");
let diff = diff(
&alef,
&bet,
&Params {
from: "a/alefr".into(),
to: (&format!("{target}/alefr")).into(),
from: patched.into(),
to: patched.into(),
context_count: 2,
..Default::default()
},
@@ -855,7 +863,8 @@ mod tests {
fb.write_all(&bet[..]).unwrap();
let _ = fa;
let _ = fb;
let output = Command::new("patch")
let output = PATCH_CMD
.new()
.arg("-p0")
.stdin(File::open(format!("{target}/abr.diff")).unwrap())
.output()
+93
@@ -98,6 +98,99 @@ pub fn report_failure_to_read_input_file(
);
}
#[cfg(test)]
pub mod testcmds {
// Command construction wrapper that provides some validation and non-obscure, "fail fast"
// feedback and error messages.
use std::any::Any;
use std::io::Write;
use std::panic::catch_unwind;
use std::process::{Command, Stdio};
use std::sync::LazyLock;
pub struct CmdFactory {
cmd: &'static str,
validated_once: LazyLock<Result<(), String>>,
validate: fn(&CmdFactory) -> (),
}
impl CmdFactory {
pub fn new(&self) -> Command {
match &*self.validated_once {
Ok(()) => Command::new(self.cmd),
Err(errmsg) => panic!(
"'{}' validation failed in earlier thread/test: {}",
self.cmd, errmsg
),
}
}
// "self" is not compatible with static initialization
fn try_catch_validate(cmd: &CmdFactory) -> Result<(), String> {
// Note catch_unwind() does _not_ hide error messages, stack traces, etc.
catch_unwind(|| {
let _ = (cmd.validate)(cmd);
})
.map_err(find_panic_message)
}
}
fn find_panic_message(payload: Box<dyn Any + Send>) -> String {
// https://github.com/rust-lang/rust/blob/1.95.0/library/std/src/panicking.rs#L771
if let Some(&s) = payload.downcast_ref::<&'static str>() {
String::from(s)
} else if let Some(s) = payload.downcast_ref::<String>() {
s.clone()
} else {
format!(
"Unusual panic payload type {:?}, look for the first thread/test that failed",
payload.type_id(),
)
}
}
pub static PATCH_CMD: CmdFactory = CmdFactory {
cmd: if cfg!(target_os = "macos") {
"gpatch" // brew install gpatch
} else {
"patch"
},
validated_once: LazyLock::new(|| CmdFactory::try_catch_validate(&PATCH_CMD)),
validate: (|myself| {
let output = Command::new(myself.cmd)
.arg("--version")
.output()
.expect(format!("`{} --version` failed", myself.cmd).as_str());
// Non-GNU versions have subtle differences. When some newlines are missing in some test
// patches, the macOS version can even stall the whole test run.
assert!(output.stdout.starts_with(b"GNU patch"));
assert!(output.status.success());
}),
};
pub static ED_CMD: CmdFactory = CmdFactory {
cmd: "ed",
validated_once: LazyLock::new(|| CmdFactory::try_catch_validate(&ED_CMD)),
validate: (|myself| {
let mut child = Command::new(myself.cmd)
.arg("!echo hello_ed")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
.spawn()
.expect("Failed to start 'ed' command");
let mut stdin = child.stdin.take().unwrap();
writeln!(stdin, "1p\nq").expect("Failed to send command to 'ed'");
let output = child
.wait_with_output()
.expect("Failed to read 'ed' stdout");
assert_eq!(String::from_utf8_lossy(&output.stdout), "9\nhello_ed\n");
}),
};
}
#[cfg(test)]
mod tests {
use super::*;
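The `testcmds` module relies on a `LazyLock` so that each external tool is validated once and every later test reuses the cached result. A minimal sketch of that validate-once pattern follows; all names here (`VALIDATION_RUNS`, `validate_tool_sketch`, `TOOL_CHECK`, `checked_tool_sketch`) are hypothetical, and the check is a stand-in for actually running something like `patch --version`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

// Counts how often the stand-in validation actually runs.
static VALIDATION_RUNS: AtomicUsize = AtomicUsize::new(0);

// Stand-in for a real check such as spawning the tool with `--version`.
fn validate_tool_sketch(tool: &str) -> Result<(), String> {
    VALIDATION_RUNS.fetch_add(1, Ordering::SeqCst);
    if tool.is_empty() {
        Err("empty tool name".to_string())
    } else {
        Ok(())
    }
}

// Computed on first access, then cached for every later caller.
static TOOL_CHECK: LazyLock<Result<(), String>> =
    LazyLock::new(|| validate_tool_sketch("patch"));

// Hands out the tool name only if the cached validation succeeded.
fn checked_tool_sketch() -> Result<&'static str, String> {
    match &*TOOL_CHECK {
        Ok(()) => Ok("patch"),
        Err(msg) => Err(msg.clone()),
    }
}
```

Caching a `Result` instead of panicking inside the initializer is what lets later callers report the original failure message.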
+2 -2
@@ -32,14 +32,14 @@ mod common {
"Expected utility name as second argument, got nothing.\n",
));
for subcmd in ["diff", "cmp"] {
for subcmd in ["diff", "cmp", "uu-diff", "uucmp"] {
let mut cmd = cargo_bin_cmd!("diffutils");
cmd.arg(subcmd);
cmd.arg("--foobar");
cmd.assert()
.code(predicate::eq(2))
.failure()
.stderr(predicate::str::starts_with("Unknown option: \"--foobar\""));
.stderr(predicate::str::contains("unrecognized option '--foobar'"));
}
Ok(())
}
+1 -1
@@ -62,7 +62,7 @@ cd ../tests
# Fetch tests/init.sh from the gnulib repository (needed since
# https://git.savannah.gnu.org/cgit/diffutils.git/commit/tests?id=1d2456f539)
curl -s "$gitserver/gitweb/?p=gnulib.git;a=blob_plain;f=tests/init.sh;hb=HEAD" -o init.sh
curl -sL "$gitserver/gitweb/?p=gnulib.git;a=blob_plain;f=tests/init.sh;hb=HEAD" -o init.sh
if [[ -n "$TESTS" ]]
then
+158
@@ -0,0 +1,158 @@
#!/usr/bin/env python3
"""
Compare the current GNU test results to the last results gathered from the main branch to
highlight if a PR is making the results better/worse.
Don't exit with error code if all failing tests are in the ignore-intermittent.txt list.
"""
import json
import sys
import argparse
from pathlib import Path
def load_ignore_list(ignore_file):
"""Load list of intermittent test names to ignore from file."""
ignore_set = set()
if ignore_file and Path(ignore_file).exists():
with open(ignore_file, "r") as f:
for line in f:
line = line.strip()
if line and not line.startswith("#"):
ignore_set.add(line)
return ignore_set
def extract_test_results(json_data):
"""Extract test results from a diffutils test-results.json.
Note: unlike sed, diffutils JSON has no 'summary' object results are
computed from the 'tests' array using the 'result' and 'test' fields.
"""
tests = json_data.get("tests", [])
passed = sum(1 for t in tests if t.get("result") == "PASS")
failed = sum(1 for t in tests if t.get("result") == "FAIL")
skipped = sum(1 for t in tests if t.get("result") == "SKIP")
summary = {"total": len(tests), "passed": passed, "failed": failed, "skipped": skipped}
failed_tests = [t["test"] for t in tests if t.get("result") == "FAIL"]
return summary, failed_tests

def compare_results(current_file, reference_file, ignore_file=None, output_file=None):
    """Compare current results with reference results."""
    ignore_set = load_ignore_list(ignore_file)

    try:
        with open(current_file, "r") as f:
            current_data = json.load(f)
        current_summary, current_failed = extract_test_results(current_data)
    except Exception as e:
        print(f"Error loading current results: {e}")
        return 1

    try:
        with open(reference_file, "r") as f:
            reference_data = json.load(f)
        reference_summary, reference_failed = extract_test_results(reference_data)
    except Exception as e:
        print(f"Error loading reference results: {e}")
        return 1

    # Calculate differences
    pass_diff = int(current_summary.get("passed", 0)) - int(reference_summary.get("passed", 0))
    fail_diff = int(current_summary.get("failed", 0)) - int(reference_summary.get("failed", 0))
    total_diff = int(current_summary.get("total", 0)) - int(reference_summary.get("total", 0))

    # Find new failures and improvements
    current_failed_set = set(current_failed)
    reference_failed_set = set(reference_failed)
    new_failures = current_failed_set - reference_failed_set
    improvements = reference_failed_set - current_failed_set

    # Filter out intermittent failures
    non_intermittent_new_failures = new_failures - ignore_set

    # Check if results are identical (no changes)
    no_changes = (
        pass_diff == 0
        and fail_diff == 0
        and total_diff == 0
        and not new_failures
        and not improvements
    )

    # If no changes, write empty output to prevent comment posting
    if no_changes:
        if output_file:
            with open(output_file, "w") as f:
                f.write("")
        return 0

    # Prepare output message
    output_lines = []
    output_lines.append("Test results comparison:")
    output_lines.append(
        f" Current: TOTAL: {current_summary.get('total', 0)} / PASSED: {current_summary.get('passed', 0)} / FAILED: {current_summary.get('failed', 0)} / SKIPPED: {current_summary.get('skipped', 0)}"
    )
    output_lines.append(
        f" Reference: TOTAL: {reference_summary.get('total', 0)} / PASSED: {reference_summary.get('passed', 0)} / FAILED: {reference_summary.get('failed', 0)} / SKIPPED: {reference_summary.get('skipped', 0)}"
    )
    output_lines.append("")

    if pass_diff != 0 or fail_diff != 0 or total_diff != 0:
        output_lines.append("Changes from main branch:")
        output_lines.append(f" TOTAL: {total_diff:+d}")
        output_lines.append(f" PASSED: {pass_diff:+d}")
        output_lines.append(f" FAILED: {fail_diff:+d}")
        output_lines.append("")

    if new_failures:
        output_lines.append(f"New test failures ({len(new_failures)}):")
        for test in sorted(new_failures):
            if test in ignore_set:
                output_lines.append(f" - {test} (intermittent)")
            else:
                output_lines.append(f" - {test}")
        output_lines.append("")

    if improvements:
        output_lines.append(f"Test improvements ({len(improvements)}):")
        for test in sorted(improvements):
            output_lines.append(f" + {test}")
        output_lines.append("")

    output_text = "\n".join(output_lines)

    if output_file:
        with open(output_file, "w") as f:
            f.write(output_text)
    else:
        print(output_text)

    # Only new failures that are not on the ignore list fail the run
    if non_intermittent_new_failures:
        print(
            f"ERROR: Found {len(non_intermittent_new_failures)} new non-intermittent test failures"
        )
        return 1

    return 0

def main():
    parser = argparse.ArgumentParser(description="Compare GNU diffutils test results")
    parser.add_argument("current", help="Current test results JSON file")
    parser.add_argument("reference", help="Reference test results JSON file")
    parser.add_argument(
        "--ignore-file", help="File containing intermittent test names to ignore"
    )
    parser.add_argument("--output", help="Output file for comparison results")
    args = parser.parse_args()
    return compare_results(args.current, args.reference, args.ignore_file, args.output)


if __name__ == "__main__":
    sys.exit(main())