Sorting text files - how ?

  • Thread starter: John Fitzsimons
You are not sorting from the R.H.S. This can be described in several
ways but you apparently want to find the last(!) period in the line
and sort on the remainder of the line (primary sort)

No, I want to sort from the last character of a line. Backwards.
and from the
beginning of the line to the aforementioned period (secondary sort).

No, I don't want that either. If I sorted that way then a sorted....

348: : 9/Aug/04 01:40: /~johnf/spirits.htm
332: 0.01%: 9/Aug/04 02:02: /~johnf/welcome.htm
392: 0.02%: 9/Aug/04 02:47: /~johnf/chspirit.htm

would result in....

332: 0.01%: 9/Aug/04 02:02: /~johnf/welcome.htm
348: : 9/Aug/04 01:40: /~johnf/spirits.htm
392: 0.02%: 9/Aug/04 02:47: /~johnf/chspirit.htm

The opposite of what I want.

Sorting the first, above, should get me....

392: 0.02%: 9/Aug/04 02:47: /~johnf/chspirit.htm
348: : 9/Aug/04 01:40: /~johnf/spirits.htm
332: 0.01%: 9/Aug/04 02:02: /~johnf/welcome.htm

If I get what I want.

John, do you have your answer or do you need a package?

Still investigating. :-)

Regards, John.
 
If you just want to group by extension and don't insist on the correct
sorting order you can do a char-wise sort from the right hand side with:

Maybe you can ask the author to improve the program to your needs? At
least it could be a challenge and would make his program even more
special. ;-)

Thanks. Interesting program. :-)

It looks like it will put the extensions together but not put what is
before them in order. Part of the problem with a potential solution is
that in some cases there are items missing. See the third line and the
second space before the : .

940: 0.01%: 9/Aug/04 02:47: /~johnf/web15.jpg
359: 0.01%: 9/Aug/04 02:02: /~johnf/knight.gif
332: : 9/Aug/04 01:40: /~johnf/spirits.htm
328: 0.01%: 9/Aug/04 02:02: /~johnf/welcome.htm
322: 0.02%: 9/Aug/04 02:47: /~johnf/chspirit.htm
299: 0.01%: 9/Aug/04 02:02: /~johnf/world2.jpg
295: : 9/Aug/04 02:02: /~johnf/blue.jpg

Making it a c.s.v. list would not IMO be easy. Another option I
thought of was to do it in three stages :

(A) Reverse each line, e.g. the first line reversed would be:

gpj.51bew/fnhoj~/:74 etc. etc.

(B) "Sort" in any text sorting program.

(C) Reverse the results obtained in (B).

Now, does anyone know a program/utility that will do (A) ? Make a text
line with abcd change to dcba please ?

Regards, John.
 
John Fitzsimons wrote ....

| Now, does anyone know a program/utility
| that will do (A) ? Make a text line
| with abcd change to dcba please ?

John ....

Perhaps a bit of Python could be useful
for reversing your strings ....

def string_reverse( string_in ) :
    # Collect the characters, reverse the list in place, then rejoin
    this_list = [ ]

    for this_char in string_in :
        this_list.append( this_char )

    this_list.reverse()

    string_out = ''.join( this_list )

    return string_out


list_strings = [ 'alt.comp.freeware' ,
                 'Cousin Stanley' ,
                 'Python' ,
                 'John Fitzsimons' ]

print()

for this_string in list_strings :
    str_rev = string_reverse( this_string )
    print( ' ' , str_rev )

print()
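
The three stages John describes, (A) reverse, (B) sort, (C) reverse, could
be combined in a few lines of Python. This is only a sketch: the function
name sort_by_line_endings is made up for illustration, and the sample lines
are taken from John's list. As John already noted, this groups the
extensions together but orders the names by their last characters, not
alphabetically.

```python
def sort_by_line_endings(lines):
    """Sort lines as if comparing them from the last character backwards."""
    reversed_lines = [line[::-1] for line in lines]  # stage (A): reverse each line
    reversed_lines.sort()                            # stage (B): ordinary text sort
    return [line[::-1] for line in reversed_lines]   # stage (C): reverse back


if __name__ == "__main__":
    sample = [
        "940: 0.01%: 9/Aug/04 02:47: /~johnf/web15.jpg",
        "359: 0.01%: 9/Aug/04 02:02: /~johnf/knight.gif",
        "332: : 9/Aug/04 01:40: /~johnf/spirits.htm",
    ]
    for line in sort_by_line_endings(sample):
        print(line)
```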
 
Another option I
thought of was to do it in three stages :

(A) Reverse the lines eg. with the first line a "reversal" would be ;

gpj.51bew/fnhoj~/:74 etc. etc.

(B) "Sort" in any text sorting program.

(C) Reverse the results obtained in (B).

Now, does anyone know a program/utility that will do (A) ? Make a text
line with abcd change to dcba please ?

Regards, John.

Filter40 will reverse-write or mirror lines of text.
Simple syntax.

Description:
http://short.stop.home.att.net/freesoft/txtfrmt1.htm
Download:
http://www.xs4all.nl/~ferguson/freeware/filter40.zip
 
I couldn't resist playing further with this.
If all the files in your list have 3-character extensions, you can sort by
file type, and still sort alphabetically by filename within type.

If LIST.TXT contains this list:
apples.png
peaches.jpg
pumpkin.gif
pie.bmp
parsley.gif
sage.bmp
rosemary.png
thyme.gif

FILTER.EXE <list.txt >rev.txt /V
produces

REV.TXT
gnp.selppa
gpj.sehcaep
fig.nikpmup
pmb.eip
fig.yelsrap
pmb.egas
gnp.yramesor
fig.emyht

Now re-reverse only the first 3 characters
FILTER.EXE <rev.txt >rev2.txt /V3,0
produces

REV2.TXT
png.selppa
jpg.sehcaep
gif.nikpmup
bmp.eip
gif.yelsrap
bmp.egas
png.yramesor
gif.emyht

Now column-select the file extensions (including the dot), copy
and paste them into the *original* file LIST.TXT at the left margin

png.apples.png
jpg.peaches.jpg
gif.pumpkin.gif
bmp.pie.bmp
gif.parsley.gif
bmp.sage.bmp
png.rosemary.png
gif.thyme.gif

Sort the result alphabetically

bmp.pie.bmp
bmp.sage.bmp
gif.parsley.gif
gif.pumpkin.gif
gif.thyme.gif
jpg.peaches.jpg
png.apples.png
png.rosemary.png

Then column-strip the left-hand extensions

pie.bmp
sage.bmp
parsley.gif
pumpkin.gif
thyme.gif
peaches.jpg
apples.png
rosemary.png

You now have a list which groups by file extension (bitmaps have floated
to the top, PNGs have sunk to the bottom), and which is still alphabetical
by filename *within each group*.

BTW, I used EditPad for column selection.
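
For comparison, the same grouping (by extension, then alphabetically by
filename within each extension) can be sketched in Python with a sort key,
without the column copy-and-paste steps. The function name
sort_by_extension is hypothetical, and the sketch does not depend on the
extensions being exactly three characters long.

```python
import os


def sort_by_extension(names):
    """Sort file names by extension first, then by the name before it."""
    def key(name):
        stem, ext = os.path.splitext(name)  # "pie.bmp" -> ("pie", ".bmp")
        return (ext.lower(), stem.lower())
    return sorted(names, key=key)


if __name__ == "__main__":
    listing = ["apples.png", "peaches.jpg", "pumpkin.gif", "pie.bmp",
               "parsley.gif", "sage.bmp", "rosemary.png", "thyme.gif"]
    for name in sort_by_extension(listing):
        print(name)
```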
 
John Fitzsimons said:
Making it a c.s.v. list would not IMO be easy. Another option I
thought of was to do it in three stages :

(A) Reverse the lines eg. with the first line a "reversal" would be ;

gpj.51bew/fnhoj~/:74 etc. etc.

(B) "Sort" in any text sorting program.

(C) Reverse the results obtained in (B).

Now, does anyone know a program/utility that will do (A) ? Make a text
line with abcd change to dcba please ?

Regards, John.

I would use MS javascript - requires Windows Scripting Host 5.6

Read the file into memory and use a regular expression with a sort function
to sort by extension , filename then path.

Alternatively, it should also be a doddle to use replace with a regular expression
to create a C.S.V. file in a format that could be sorted in a spreadsheet.
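
A rough Python sketch of the C.S.V. idea, assuming each log line has four
fields separated by a colon followed by whitespace, as in John's sample. The
column names (count, percent, date, path) are guesses made for illustration,
and log_to_csv is a hypothetical helper. A missing percentage simply becomes
an empty cell.

```python
import csv
import io
import re

# A colon followed by whitespace separates fields; the colon inside
# a time such as 02:47 is followed by a digit, so it is left alone.
FIELD_SEPARATOR = re.compile(r":\s+")


def log_to_csv(lines):
    """Return the log lines as CSV text with one row per line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["count", "percent", "date", "path"])  # assumed column names
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = FIELD_SEPARATOR.split(line, maxsplit=3)
        if len(fields) == 4:  # skip lines that do not fit the assumed layout
            writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    print(log_to_csv([
        "940: 0.01%: 9/Aug/04 02:47: /~johnf/web15.jpg",
        "332: : 9/Aug/04 01:40: /~johnf/spirits.htm",
    ]))
```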

I've got an old script I could adapt to do either if that suits?

Regards.

Mel.
 
Now, does anyone know a program/utility that will do (A) ? Make a text
line with abcd change to dcba please ?

Sure. There are many pipe tools. One coming to mind is "filter" by Bob
Ferguson:

http://hello.to/ferguson

Look under DOS-Filters. Command line for your problem would be:

FILTER.EXE < test.txt > reverse.txt /v

What amazes me is your attempt to achieve less than what has been provided
to you up to now. You've seen responses regarding some Unix based tools
which should lead to complete success. And my suggestion using the LMod
filter does very well, too. Remember this:

Message-ID: <[email protected]>

If you really want to avoid the command line, you should place the mentioned
LMod command within a batch file. Replace test.txt with %1 and insert
a second line to open result.txt with your favorite text editor. If you now
add a menu entry for this batch file to the context menu of txt-files,
everything works automatically. You can even replace result.txt with
test.txt, because the suggested LMod filter is nearly safe, so you can
overwrite the original. Even consecutive calls of the batch file do no
further harm. (At least compared with anything bad possibly done by the
first call. I don't dare assert that my pipe suggestion runs cleanly
under *all* possible circumstances...)

BeAr
 
I would use MS javascript - requires Windows Scripting Host 5.6

The following (wholly inadequately tested) script seems to work;
it produced this output with your test file:-


9/Aug/04 02:02: /~johnf/knight.gif
9/Aug/04 02:47: /~johnf/chspirit.htm
9/Aug/04 02:20: /~johnf/ouija.htm
9/Aug/04 01:40: /~johnf/spirits.htm
9/Aug/04 02:02: /~johnf/welcome.htm
9/Aug/04 02:02: /~johnf/blue.jpg
9/Aug/04 02:47: /~johnf/web15.jpg
9/Aug/04 02:02: /~johnf/world2.jpg

If you have WScript associated with .JS files,
dropping a file on the script's icon should sort it,
save it as ~sorted.tmp in the script's folder, then
open it in notepad so you can save it with a suitable
name.

Should you choose to try it please check it carefully
in case the newsreaders break lines or mangle it -
(I notice OE thinks my regular expressions are file: links)


//cscript testsort.js source dest or drop file on script icon
//if drag + drop used, saved as ~sorted.tmp in script's dir
//dest WILL be overwritten!!

var ReadOnly = 1;
var Ascii = 0;
var ForWriting = 2;

var rx = new RegExp;

//           (path )(file   )(.ext   )
var regx = " /(.*?/)([^/]+?)(\\.\\w+)?$";

rx.compile(regx);


var fso = new ActiveXObject("Scripting.FileSystemObject");
var mydest = "";
var mysource = "";

switch (WScript.Arguments.Unnamed.length)
{
    case 2 :
        mydest = WScript.Arguments.Unnamed(1);
        // falls through to pick up the source as well

    case 1 :
        mysource = WScript.Arguments.Unnamed(0);
        // falls through to the checks below

    case 0 :
        if (mysource == "")
        {
            alert(WScript.ScriptName + " sourcefile destfile");
            WScript.Quit();
        }

        if (!fso.FileExists(mysource))
        {
            alert(mysource + " not found");
            WScript.Quit();
        }
}

if (mydest == "")
{
    var scriptpath = fso.GetParentFolderName(WScript.ScriptFullName);
    mydest = fso.BuildPath(scriptpath, "~sorted.tmp");
}


var f = fso.GetFile(mysource);

var fs = f.OpenAsTextStream(ReadOnly, Ascii);

var text = fs.ReadAll();
fs.Close();

var rxsplit = new RegExp;
rxsplit.compile("\\s*\\n");     //match end of lines

text = text.split(rxsplit);     //convert to array

text.sort(sortoffilenames);

text = text.join("\r\n");       //convert back to string

//save file

fs = fso.OpenTextFile(mydest, ForWriting, true);
fs.write(text);
fs.Close();

// open in notepad

var WshShell = new ActiveXObject("WScript.Shell");

WshShell.Run("notepad" + ' "' + mydest + '"', 1, false);


// sort function 1 == a > b, 0 == a=b, -1 == a < b

function sortoffilenames(a, b)
{
    var aa = rx.exec(a.toLowerCase());
    var bb = rx.exec(b.toLowerCase());

    // lines that do not match the pattern sort to the end
    if (aa == null)
    {
        if (bb == null) return(0);
        else return(1);
    }
    else if (bb == null) return(-1);

    if (aa[3] > bb[3]) return(1);   //extension
    if (aa[3] < bb[3]) return(-1);

    if (aa[2] > bb[2]) return(1);   //filename
    if (aa[2] < bb[2]) return(-1);

    if (aa[1] > bb[1]) return(1);   //should be path
    if (aa[1] < bb[1]) return(-1);

    return(0);
}


function alert(message)
{
    WScript.echo(message);
}
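
For anyone without Windows Scripting Host, the same ordering (extension,
then filename, then path) can be sketched in Python. This is not a port of
the whole script, only of its comparison logic: it reuses the regular
expression from the script above, while the names URL_PARTS, sort_key and
sort_log_lines are made up for this example. Lines that do not match the
pattern sort to the end, as in the script.

```python
import re

# Same pattern as the script: " /", then path, file name and optional ".ext"
URL_PARTS = re.compile(r" /(.*?/)([^/]+?)(\.\w+)?$")


def sort_key(line):
    """Build a key that orders by extension, then file name, then path."""
    match = URL_PARTS.search(line.lower().rstrip())
    if match is None:
        return (1, "", "", "")           # unmatched lines go last
    path, name, ext = match.groups(default="")
    return (0, ext, name, path)


def sort_log_lines(lines):
    return sorted(lines, key=sort_key)


if __name__ == "__main__":
    sample = [
        "940: 0.01%: 9/Aug/04 02:47: /~johnf/web15.jpg",
        "359: 0.01%: 9/Aug/04 02:02: /~johnf/knight.gif",
        "332: : 9/Aug/04 01:40: /~johnf/spirits.htm",
        "295: : 9/Aug/04 02:02: /~johnf/blue.jpg",
    ]
    for line in sort_log_lines(sample):
        print(line)
```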
 