Editor to sort large ascii files by column?


JeB

Greetings all -

I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log file
entries and the number is a date code if you're curious.

A desirable second feature (wish list) would be the ability to select
lines like the first one with the 0.0.0.0 and * and then delete them,
en masse. (There will be hundreds of them)

17864553 0.0.0.0 *
17816501 158.103.0.1 campus-pat-get-port.morgan.edu
17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net
 
JeB said:
Greetings all -

I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log file
entries and the number is a date code if you're curious.

A desirable second feature (wish list) would be the ability to select
lines like the first one with the 0.0.0.0 and * and then delete them,
en masse. (There will be hundreds of them)

17864553 0.0.0.0 *
17816501 158.103.0.1 campus-pat-get-port.morgan.edu
17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net

http://www.xoology.com/concertox/xool/home/products/coda.html

This product may provide the tool(s) you need. It has features to allow
for selected editing of various formats with a built in search feature.

http://www.xoology.com/concertox/xool/home/products/download.html

The product does have a search select all and then delete feature after
the search/find process creates the data reference in a new window.
I am learning to use the product for another purpose but it may very
well be the one you need.

Hope this helps.

SLP
******
 
Greetings all -
I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log file
entries and the number is a date code if you're curious.
A desirable second feature (wish list) would be the ability to select
lines like the first one with the 0.0.0.0 and * and then delete them,
en masse. (There will be hundreds of them)
17864553 0.0.0.0 *
17816501 158.103.0.1 campus-pat-get-port.morgan.edu
17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net

All editors that I have used cue on the first column. They will
further sort by IP and then by the name of the IP. If you only want to
sort on the first column this might be a problem.
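
For sorting on the first column only, a stable sort keyed on that field leaves lines with the same date code in their original order instead of reordering them by IP or host name. A minimal Python sketch, assuming every line starts with an integer date code and there are no blank lines (the function name is made up for illustration):

```python
def sort_by_first_column(lines):
    """Sort lines by the leading integer field only."""
    # sorted() is stable: lines that share a date code keep their
    # input order and are not compared on the IP or host name.
    return sorted(lines, key=lambda line: int(line.split(maxsplit=1)[0]))


sample = [
    "17864553 0.0.0.0 *",
    "17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net",
    "17816501 158.103.0.1 campus-pat-get-port.morgan.edu",
]

for line in sort_by_first_column(sample):
    print(line)
```

Here the two 17816501 lines come out in the order they went in, with the 203.168.0.2 entry still ahead of the 158.103.0.1 entry.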

For large files (100 meg) EditPad (Lite or Classic) is the smallest
program of the highest power I've tried. There are 4 and 2 registry
entries respectively. Cetus WordPad is good also.

More info, editors & links:

http://www.woundedmoon.org/PL/text1rev02.php
 
Greetings all -

I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log
file entries and the number is a date code if you're curious.

Hmm...if the datecode is always at the beginning of the line, or
the datecode is always the same number of digits, then I'd suggest
using PSPad from http://www.pspad.com/index_en.html . It can sort on
the entire line, and also on columns (although the column sorting
seems dependent upon the position on a line, as opposed to field
separators).
A desirable second feature (wish list) would be the ability to
select lines like the first one with the 0.0.0.0 and * and then
delete them, en masse. (There will be hundreds of them)

17864553 0.0.0.0 *
17816501 158.103.0.1 campus-pat-get-port.morgan.edu
17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net

You can do that with regular expressions in PSPad. Do a search for

^.* 0\.0\.0\.0 \*$

and leave the replace with box blank; that should replace all
instances of lines with 0.0.0.0 with blank lines. You can then use
the sort feature to sort, and also remove duplicate lines, which will
get rid of the duplicate blank lines.
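
The same pattern can also delete the matching lines outright, which skips the blank-line cleanup step. A rough Python sketch of that idea, not PSPad's own behaviour; the only change to the pattern is an optional trailing newline so the whole line disappears (the function name is hypothetical):

```python
import re

# Pattern from the post above, plus "\n?" to swallow the line ending.
# MULTILINE makes ^ and $ match at the start and end of every line.
UNWANTED = re.compile(r"^.* 0\.0\.0\.0 \*$\n?", re.MULTILINE)


def delete_matching_lines(text):
    """Return text with every '... 0.0.0.0 *' line removed."""
    return UNWANTED.sub("", text)


log = (
    "17864553 0.0.0.0 *\n"
    "17816501 158.103.0.1 campus-pat-get-port.morgan.edu\n"
    "17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net\n"
)
print(delete_matching_lines(log), end="")
```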
 
JeB said:
I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log file
entries and the number is a date code if you're curious.
Emacs can do so.
A desirable second feature (wish list) would be the ability to select
lines like the first one with the 0.0.0.0 and * and then delete them,
en masse. (There will be hundreds of them)
Emacs can do so as well.

Ciao,
Bernd
 
Greetings all -

I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log file
entries and the number is a date code if you're curious.

A desirable second feature (wish list) would be the ability to select
lines like the first one with the 0.0.0.0 and * and then delete them,
en masse. (There will be hundreds of them)

17864553 0.0.0.0 *
17816501 158.103.0.1 campus-pat-get-port.morgan.edu
17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net


you could do this from DOS.

to sort:
sort myfile.log > sorted.log

to remove unwanted lines, use find with /v, which prints only the lines that
do not contain the given string. the string has to be in double quotes, so if
you want to remove " 0.0.0.0 *" then:

find /v " 0.0.0.0 *" myfile.log > stripped.log

....and for huge files i expect that'll be as quick as you're going to get.
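
For anyone without those commands to hand, the filtering step can be sketched in Python as a line-by-line copy, so the whole file never has to sit in memory. This is only an illustration of the same idea; the function name and its default argument are assumptions, and the file names are the ones used above:

```python
def strip_lines(src_path, dst_path, unwanted=" 0.0.0.0 *"):
    """Copy src to dst, skipping lines that contain `unwanted`.

    Returns the number of lines kept.
    """
    kept = 0
    # latin-1 maps every byte to one character, so stray non-ASCII bytes
    # pass through unchanged; newline="" leaves line endings untouched.
    with open(src_path, encoding="latin-1", newline="") as src, \
         open(dst_path, "w", encoding="latin-1", newline="") as dst:
        for line in src:
            if unwanted not in line:  # plain substring test, no regex
                dst.write(line)
                kept += 1
    return kept
```

It would be called as strip_lines("myfile.log", "stripped.log").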


jack
tweening polymorphically
 
Greetings all -

I'm searching for an editor that can sort a large file of one line
entries as below, using only the column on the left. These are log file
entries and the number is a date code if you're curious.

thanx to all who replied, I'll explore the suggestions.
 
You can do that with regular expressions in PSPad. Do a search for
^.* 0\.0\.0\.0 \*$
and leave the replace with box blank; that should replace all
instances of lines with 0.0.0.0 with blank lines. You can then use
the sort feature to sort, and also remove duplicate lines, which will
get rid of the duplicate blank lines.

Wow ! I love that solution. Thanks. :-)

Now, if one could do a regex search of all lines that do NOT have that
in (see Jack's post) then one could do things in a single step and
save the results to a second file.

Would be interesting to know IF one can do a "does not contain" regex
search, in what editors, and with what syntax. Can anyone answer
any/all of those three questions ?

Regards, John.
 
Wow ! I love that solution. Thanks. :-)

No problem.
Now, if one could do a regex search of all lines that do NOT have
that in (see Jack's post) then one could do things in a single
step and save the results to a second file.

Would be interesting to know IF one can do a "does not contain"
regex search, in what editors, and with what syntax. Can anyone
answer any/all of those three questions ?

I don't see that feature listed anywhere in PSPad, and unfortunately
the macro feature of PSPad doesn't seem powerful enough to do it.
Also, I don't know enough about other editors to comment. I might do
a little looking, though, just to satisfy my own curiosity. However,
I really do like PSPad, and I can't imagine finding any other editors
that might replace it.
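
On the syntax question: in regex engines that support negative lookahead, a "does not contain" match can be written as ^(?!...) followed by .*$. Whether a particular editor's engine supports lookahead is not something this thread establishes, so the following is only a sketch using Python's re module, with the inner pattern taken from the earlier post:

```python
import re

# (?!...) is a negative lookahead: it succeeds only when the line,
# read from its start, does NOT match the pattern inside it.
NOT_CONTAINING = re.compile(r"^(?!.* 0\.0\.0\.0 \*$).*$")

lines = [
    "17864553 0.0.0.0 *",
    "17816501 158.103.0.1 campus-pat-get-port.morgan.edu",
    "17816501 203.168.0.2 customer-203-168-0-2.ph.inter.net",
]

# Keep only the lines the pattern matches, i.e. those without " 0.0.0.0 *".
keep = [line for line in lines if NOT_CONTAINING.match(line)]

for line in keep:
    print(line)
```

This gives the single-step version: the kept lines can be written straight to a second file.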
 
Would be interesting to know IF one can do a "does not contain" regex
search, in what editors, and with what syntax. Can anyone answer
any/all of those three questions ?

you have it now john, in emacs: 'Delete, All lines not containing'

jack
we don't believe in the if and maybe scenario
 
On Wed, 31 Dec 2003 10:27:54 +1100, John Fitzsimons wrote:
you have it now john, in emacs: 'Delete, All lines not containing'

Isn't that a function you were good enough to make specifically for
me ? I was wondering whether this sort of thing came as "standard"
with any text editors. Why ? Because I often recommend software to
others.

Many thanks however for the reminder. :-)

Regards, John.
 
Isn't that a function you were good enough to make specifically for
me ? I was wondering whether this sort of thing came as "standard"
with any text editors. Why ? Because I often recommend software to
others.


well, strictly speaking i made it specifically for me some years ago...and
then selected it as one you might like :)

i don't think i've ever seen it elsewhere.

jack
But it's true, even if it didn't happen
 
I missed the beginning of the thread. But regarding the subject:

Try Context:

http://fixedsys.com/context/

--- some fragments ---
ConTEXT is a small, fast and powerful text editor, developed mainly to
serve as secondary tool for software developers. This editor is
freeware.

Features:
* unlimited open files
* unlimited editing file size, 4kB line length
* text sort
* normal and columnar text selection
--- some fragments ---

Cu,
Alex
 
'Delete, All lines not containing'
well, strictly speaking i made it specifically for me some years ago...and
then selected it as one you might like :)
i don't think i've ever seen it elsewhere.

Don't think I have either. That's why I asked the question. :-)

Also, your superb programming skills, and willingness to share such
functionality, are examples of the very best of contributors to this
newsgroup. Or more properly any newsgroup anywhere.

The same applies to other programmers such as Dewey, Cousin
Stanley etc.

Regards, John.
 