diff Performance Tradeoffs
GNU diff runs quite efficiently; however, in some circumstances you can cause it to
run faster or produce a more compact set of changes. There are two ways that you
can affect the performance of GNU diff by changing the way it compares files.
Performance has more than one dimension. These options improve one aspect of
performance at the cost of another, or they improve performance in some cases
while hurting it in others.
The way that GNU diff determines which lines have changed always comes up with a
near-minimal set of differences. Usually it is good enough for practical
purposes. If the diff output is large, you might want diff to use a modified
algorithm that sometimes produces a smaller set of differences. The '-d' or
'--minimal' option does this; however, it can also cause diff to run more slowly
than usual, so it is not the default behavior.
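As a rough illustration, the following shell sketch runs diff with and without
'--minimal' and counts the lines each run reports as changed. The file names and
contents are hypothetical. On inputs this small both runs are expected to find
the same set of differences, so the sketch only shows how to make the
comparison on your own files.

```shell
# Hypothetical sample files; substitute any two text files.
workdir=$(mktemp -d)
printf '%s\n' a b c d e f g > "$workdir/old.txt"
printf '%s\n' a x c d y f g > "$workdir/new.txt"

# diff exits with status 1 when the files differ, so record the status
# instead of letting it abort a script that runs under 'set -e'.
default_status=0
diff "$workdir/old.txt" "$workdir/new.txt" > "$workdir/default.diff" || default_status=$?
minimal_status=0
diff --minimal "$workdir/old.txt" "$workdir/new.txt" > "$workdir/minimal.diff" || minimal_status=$?

# Count the '<' and '>' lines, that is, the lines reported as changed.
default_changed=$(grep -c '^[<>]' "$workdir/default.diff")
minimal_changed=$(grep -c '^[<>]' "$workdir/minimal.diff")
echo "default: $default_changed changed lines, --minimal: $minimal_changed changed lines"

rm -rf "$workdir"
```

If the two counts match on your real inputs, '--minimal' is costing time without
shrinking the output.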
When the files you are comparing are large and have small groups of changes
scattered throughout them, use the '-H' or '--speed-large-files' option to make
a different modification to the algorithm that diff uses. If the input files
have a constant small density of changes, this option speeds up the comparisons
without changing the output. If not, diff might produce a larger set of
differences; however, the output will still be correct.
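The sketch below builds two hypothetical large files with one changed line in
every thousand, then compares the default output with the output of '-H'. The
file names, sizes, and change pattern are assumptions chosen for illustration;
timings are left out because they depend on the machine and the inputs.

```shell
# Hypothetical inputs: 20000 numbered lines, with every 1000th line altered.
workdir=$(mktemp -d)
seq 1 20000 > "$workdir/big_old.txt"
awk 'NR % 1000 == 0 { print $0 " changed"; next } { print }' \
    "$workdir/big_old.txt" > "$workdir/big_new.txt"

# Exit status 1 only means "the files differ", so capture it explicitly.
default_status=0
diff "$workdir/big_old.txt" "$workdir/big_new.txt" > "$workdir/default.diff" || default_status=$?
fast_status=0
diff -H "$workdir/big_old.txt" "$workdir/big_new.txt" > "$workdir/fast.diff" || fast_status=$?

# Count the lines each run reports as removed ('<') or added ('>').
default_changed=$(grep -c '^[<>]' "$workdir/default.diff")
fast_changed=$(grep -c '^[<>]' "$workdir/fast.diff")

# Report whether the two runs produced byte-identical output.
if cmp -s "$workdir/default.diff" "$workdir/fast.diff"; then
    echo "-H produced the same output ($fast_changed changed lines)"
else
    echo "-H output differs: $fast_changed changed lines vs $default_changed"
fi

rm -rf "$workdir"
```

To judge whether '-H' is worthwhile, wrap each diff invocation in the shell's
time keyword on your own files.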
Normally diff discards the prefix and suffix that are common to both files
before it attempts to find a minimal set of differences. This makes diff run
faster, but occasionally it may produce non-minimal output.
The '--horizon-lines=LINES' option prevents diff from discarding the last LINES
lines of the prefix and the first LINES lines of the suffix. This gives diff
further opportunities to find a minimal output.
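A minimal usage sketch, assuming two hypothetical files that share a long prefix
and suffix around a single changed line. The value 10 is an arbitrary choice.
With inputs this simple the option is not expected to change the result; the
sketch only shows the invocation.

```shell
# Hypothetical inputs: 50 common lines, one changed line, 50 more common lines.
workdir=$(mktemp -d)
{ seq 1 50; echo old; seq 51 100; } > "$workdir/old.txt"
{ seq 1 50; echo new; seq 51 100; } > "$workdir/new.txt"

# Keep 10 lines of the common prefix and 10 lines of the common suffix
# available to the comparison. Exit status 1 means the files differ.
horizon_status=0
diff --horizon-lines=10 "$workdir/old.txt" "$workdir/new.txt" \
    > "$workdir/horizon.diff" || horizon_status=$?

# Count the lines reported as removed ('<') or added ('>').
horizon_changed=$(grep -c '^[<>]' "$workdir/horizon.diff")
echo "changed lines with --horizon-lines=10: $horizon_changed"

rm -rf "$workdir"
```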