[ANN] -- Program to remove duplicate lines from large files

  • Thread starter: Dewey Edwards

Dewey Edwards

Hi,

About two months ago JohnF asked if anyone knew of a program to remove
duplicate lines from 100K+ line text files. I shot my mouth off and
said it would be a piece of cake to write.

Maybe it WASN'T that easy, but it's now done.

home.nycap.rr.com/dewed/rm_dup_lines_v0.1.zip

about 24K

Program is freeware, actually greenware too: no install, and no
registry entries added by the program. The current version has to be
run in a DOS box; however, there is no 640K limitation. The program
has run successfully on a 20+ MB text file, with a run time under
5 seconds.

The program neither sorts the input file, nor stores it internally.
It does, however, save line "differences" rather than the lines
themselves in a TRIE data structure. See the included README file for
more information.
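
For anyone who hasn't met a trie before, here is a rough Python sketch of
the general idea. It is NOT the code from the zip and may differ from
what the README describes; the names TrieNode, insert_if_new and
remove_duplicate_lines are made up for illustration. It assumes a plain
character trie: lines that share a prefix share nodes, so each new line
only adds the part that differs from lines already seen.

```python
class TrieNode:
    """One character position in the trie (hypothetical helper)."""
    __slots__ = ("children", "is_end")

    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.is_end = False  # True if a complete line ends at this node


def insert_if_new(root, line):
    """Insert line into the trie; return True only if it was not there yet."""
    node = root
    for ch in line:
        nxt = node.children.get(ch)
        if nxt is None:
            # Only the part of the line not already shared gets new nodes.
            nxt = TrieNode()
            node.children[ch] = nxt
        node = nxt
    if node.is_end:
        return False  # exact same line was seen before
    node.is_end = True
    return True


def remove_duplicate_lines(src, dst):
    """Copy lines from src to dst, keeping the first copy of each line.

    The input is read once, in order, with no sorting.
    Returns the number of lines written.
    """
    root = TrieNode()
    kept = 0
    for line in src:
        key = line.rstrip("\r\n")  # compare lines without line endings
        if insert_if_new(root, key):
            dst.write(key + "\n")
            kept += 1
    return kept
```

To try it, open the input and output files with open() and pass the two
file objects to remove_duplicate_lines. A pure-Python trie like this
uses a lot of memory per node, so treat it as an illustration of the
technique rather than something tuned for 20+ MB files.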

Hope it helps somebody,

dewey
 