mirror of
https://github.com/uutils/diffutils.git
synced 2026-06-29 07:08:33 -04:00
a316262603
Before this change, we would first find all changes so we could obtain the largest offset we will report and use that to set up the padding. Now we use the file sizes to estimate the largest possible offset.

Not only does this allow us to print earlier and reduce memory usage, since we no longer store diffs to report later, but it also fixes a case in which our output differed from GNU cmp's, which also seems to estimate the padding based on size.

Memory usage drops by a factor of roughly 1000(!), without losing performance, while comparing two binaries of hundreds of MBs:

Before:

    Maximum resident set size (kbytes): 2489260

    Benchmark 1: ../target/release/diffutils \
    cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
      Time (mean ± σ):     14.466 s ±  0.166 s    [User: 12.367 s, System: 2.012 s]
      Range (min … max):   14.350 s … 14.914 s    10 runs

After:

    Maximum resident set size (kbytes): 2636

    Benchmark 1: ../target/release/diffutils \
    cmp -l -b /usr/lib64/chromium-browser/chromium-browser /usr/lib64/firefox/libxul.so
      Time (mean ± σ):     13.724 s ±  0.038 s    [User: 12.263 s, System: 1.372 s]
      Range (min … max):   13.667 s … 13.793 s    10 runs
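
For illustration only, the size-based padding described above could be sketched roughly as follows. This is not the code from the commit: `decimal_width`, `offset_width`, and `report_differences` are hypothetical names, taking the larger of the two file sizes as the upper bound is an assumption, and the extra columns printed by `-b` are not modelled.

```rust
use std::io::{self, Read, Write};

/// Number of decimal digits needed to print `n`.
fn decimal_width(n: u64) -> usize {
    // `ilog10` panics on zero, so handle that case explicitly.
    if n == 0 { 1 } else { n.ilog10() as usize + 1 }
}

/// Hypothetical helper: estimate the offset column width from the file
/// sizes alone. Assumption: no reported offset can exceed the larger size,
/// so its digit count is a safe upper bound for the padding.
fn offset_width(size_a: u64, size_b: u64) -> usize {
    decimal_width(size_a.max(size_b))
}

/// Hypothetical helper: stream both inputs and print each differing byte
/// as soon as it is found, instead of collecting differences first.
/// Returns the number of differing bytes. Offsets are 1-based and byte
/// values are printed in octal, in the style of `cmp -l`.
fn report_differences<A: Read, B: Read, W: Write>(
    a: A,
    b: B,
    width: usize,
    mut out: W,
) -> io::Result<u64> {
    let mut count = 0;
    // `zip` stops at the end of the shorter input. Real code would wrap
    // files in a `BufReader` before calling `bytes()`.
    for (index, (x, y)) in a.bytes().zip(b.bytes()).enumerate() {
        let (x, y) = (x?, y?);
        if x != y {
            writeln!(out, "{:>width$} {:>3o} {:>3o}", index + 1, x, y)?;
            count += 1;
        }
    }
    Ok(count)
}
```

Because the width is known before the first byte is compared, each line can be written immediately and nothing has to be buffered, which is where the memory saving in the numbers above would come from.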