Dewey Edwards
Hi,
About two months ago JohnF asked if anyone knew of a program to remove
duplicate lines from 100K+ line text files. I shot my mouth off and said
it would be a piece of cake to write.
Maybe it WASN'T that easy, but it's now done.
home.nycap.rr.com/dewed/rm_dup_lines_v0.1.zip
about 24K
The program is freeware, and actually greenware too: no install, and the
program adds no registry entries. The current version has to be run in a
DOS box; however, there is no 640K limitation. The program has run
successfully on a 20+ MB text file, with a run time under 5 seconds.
The program neither sorts the input file, nor stores it internally.
It does, however, save line "differences" rather than the lines
themselves in a TRIE data structure. See the included README file for
more information.
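
For anyone curious how a trie can drop duplicates without sorting, here is
a minimal Python sketch. It is NOT the program's actual code (the README
covers that), and it stores whole lines character by character rather than
the "differences" described above. The names remove_duplicate_lines and
_END are made up for this illustration. Lines that share a prefix share
trie nodes, and the first occurrence of each line is kept in its original
order.

```python
# Illustrative sketch only: a plain character trie built from nested dicts.
# _END is a hypothetical sentinel marking "a complete line ends at this node".
_END = object()


def remove_duplicate_lines(lines):
    """Yield each distinct line once, in first-seen order."""
    root = {}
    for line in lines:
        node = root
        # Walk (and extend) the trie one character at a time.
        for ch in line:
            node = node.setdefault(ch, {})
        # The line is new only if no earlier line ended at this exact node.
        if _END not in node:
            node[_END] = True
            yield line


if __name__ == "__main__":
    sample = ["apple\n", "banana\n", "apple\n", "app\n", "banana\n"]
    for kept in remove_duplicate_lines(sample):
        print(kept, end="")
```

Since the function accepts any iterable of lines, an open file object can
be passed in directly, so the input never has to be loaded as one big list.
A dict-based trie like this one is memory hungry in Python, so treat it as
a way to see the idea rather than something tuned for 20 MB files.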
Hope it helps somebody,
dewey