Reading text file, bottom up?

Guest

I have a tab-delimited text file which gets populated on a daily basis via
an automated process. Each new entry is written at the bottom. I need to create a
utility which makes a copy of this file containing the 10 most recent entries.

When I read lines using a StreamReader object it starts from the top, so
looping through the lines and keeping track of the line number is not helpful. Is there
any way to start reading from the bottom, or could anyone suggest some other
way?

Thanks
 
Job Lot said:
I have a tab-delimited text file which gets populated on a daily basis via
an automated process. Each new entry is written at the bottom. I need to create a
utility which makes a copy of this file containing the 10 most recent entries.

When I read lines using a StreamReader object it starts from the top, so
looping through the lines and keeping track of the line number is not helpful. Is there
any way to start reading from the bottom, or could anyone suggest some other
way?

Reading from the end of a file is very difficult in a general case,
partly because you'd have to keep either seeking backwards for every
byte you read, or you'd have to keep a buffer and read (say) the last
10 bytes, then seek to the start of the 10 bytes before that, etc.

You then have the added problem of character encodings - it's not so
bad if you're using a fixed-width character encoding, but with
something like UTF-8, you have to work out where the start of the
previous character is, etc.
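
For what it's worth, the "keep a buffer and step backwards" idea can be sketched roughly as below. This is only a sketch under stated assumptions, not a general solution: it assumes the file is ASCII or UTF-8 (where a byte of 10 can only ever be a line feed) and that lines end in LF or CRLF. The names TailSketch, LastLines and blockSize are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class TailSketch
{
    // Returns up to `count` trailing lines of the file, oldest first.
    // Assumption: ASCII or UTF-8 text with LF or CRLF line endings.
    public static List<string> LastLines(string path, int count, int blockSize = 4096)
    {
        var result = new List<string>();
        if (count <= 0) return result;

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long end = stream.Length;
            long pos = end;
            long start = 0;      // byte offset where the wanted lines begin
            int newlines = 0;
            bool found = false;
            var block = new byte[blockSize];

            // Walk backwards one block at a time, counting line feeds.
            while (pos > 0 && !found)
            {
                int toRead = (int)Math.Min(blockSize, pos);
                pos -= toRead;
                stream.Seek(pos, SeekOrigin.Begin);

                int read = 0;
                while (read < toRead)
                {
                    int n = stream.Read(block, read, toRead - read);
                    if (n == 0) break;
                    read += n;
                }

                for (int i = read - 1; i >= 0; i--)
                {
                    if (block[i] != (byte)'\n') continue;
                    // A line feed that is the very last byte only ends the final line.
                    if (pos + i == end - 1) continue;
                    if (++newlines == count)
                    {
                        start = pos + i + 1;
                        found = true;
                        break;
                    }
                }
            }

            // Decode forwards from the offset that was found.
            stream.Seek(start, SeekOrigin.Begin);
            using (var reader = new StreamReader(stream, Encoding.UTF8))
            {
                string line;
                while ((line = reader.ReadLine()) != null) result.Add(line);
            }
        }

        // If the file grew while we were reading, keep only the newest lines.
        if (result.Count > count) result.RemoveRange(0, result.Count - count);
        return result;
    }
}
```

Calling TailSketch.LastLines(path, 10) would then only touch the tail of the file. It would not be safe for UTF-16 or other encodings where a byte of 10 can be part of a different character.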
 
Job,

Just an idea,

Would it be an idea to read it into an ArrayList using Insert(0, line), and to
remove row 10 each time it exists, so that only the newest 10 lines are kept?

Cor
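
A rough sketch of that "keep only the newest 10 while reading" idea follows. Note that it swaps the ArrayList with Insert(0, line) for a Queue, which gives the same effect of a sliding window but returns the lines oldest first. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public static class RollingWindowSketch
{
    // Reads every line once, but never holds more than `count` lines in memory.
    public static List<string> LastLines(TextReader reader, int count)
    {
        var window = new Queue<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            window.Enqueue(line);
            // Drop the oldest line as soon as the window is over-full.
            if (window.Count > count) window.Dequeue();
        }
        return new List<string>(window); // oldest first
    }
}
```

This still reads the whole file from the top, but memory use stays small however large the file gets. It could be called with a StreamReader opened on the log file and a count of 10.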
 
Why not read every line into a stack? Then when the file is done, the
last ten items will be at the top of the stack. Just pop 10 times and
you're done.
 
Is there a column in the file which corresponds to the order in which the entries are written (such
as a date or sequence number)?


Paul
Microsoft MVP (Visual Basic)
 
Thanks everyone.

Paul, there is no such column which corresponds to the order.

Patrick A, reading into a stack seems to fit the bill, but I really don't know
how to read lines into a stack.
 
It doesn't actually. The code displayed in the blog is out of date.

Here's the code I use, which seems to work but may not be perfect:

public void FindLastPosition()
{
    FileStream stream = new FileStream(filename, FileMode.Open,
        FileAccess.Read, FileShare.ReadWrite);
    int charSize = System.Text.Encoding.Default.GetByteCount(" ");
    int count = 0;
    stream.Seek(0, SeekOrigin.End);
    long pos = stream.Position;
    while (pos>0 && count<lines)
    {
        pos -= charSize;
        stream.Seek(pos, SeekOrigin.Begin);
        if ((char)stream.ReadByte()=='\r') count++;
    }
    stream.Seek(pos<1 ? 0 : pos, SeekOrigin.Begin);

    filePos = stream.Position;
    stream.Close();
}

Hope that helps.
 
John Wood said:
It doesn't actually. The code displayed in the blog is out of date.

Then "it" does - where "it" is the code Nick posted a link to!

John Wood said:
Here's the code I use, which seems to work but may not be perfect:

That'll work on ASCII-like files with a single byte per character and a
line terminator which is or starts with carriage return. It won't work
reliably on any other type of file:

1) It assumes the file is in Encoding.Default
2) It assumes a fixed character size for seeking, which is invalid for
encodings such as UTF-8
3) It assumes a single byte per character when reading a byte, and
then assumes that the file is effectively ASCII by assuming that
a byte of 13 means carriage return. There are many characters in
Encoding.Unicode (for example) which have 13 as one of their bytes
4) It assumes that '\r' is part of the line terminator, which won't
be true for a lot of files generated on Unix

There are ways to fix some of these problems (and to make it perform
better by reading into a buffer rather than seeking for each byte) but
it's far from trivial given encodings like UTF-8.
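
To make point 3 concrete, here is a small sketch that counts how many bytes of value 13 appear in the encoded form of a string. The character U+010D ('č') is just one example picked for illustration, and the class and method names are made up.

```csharp
using System.Text;

public static class ByteThirteenSketch
{
    // Counts the bytes equal to 13 in the encoded form of `text`.
    public static int CountByte13(string text, Encoding encoding)
    {
        int count = 0;
        foreach (byte b in encoding.GetBytes(text))
        {
            if (b == 13) count++;
        }
        return count;
    }
}
```

CountByte13("\u010D", Encoding.Unicode) gives 1, because the UTF-16 little-endian bytes are 0D 01, even though the string contains no carriage return at all. With Encoding.UTF8 the same character becomes C4 8D and the count is 0.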
 
Well sorry, "it" was actually the download on the page I thought!

Anyway, you do make some very valid points, but it should work OK for ASCII.

Regarding the "reading into a buffer instead", it seems to perform quite
well because the stream seems to be buffered anyway (assuming it's not
forward only buffering).

I'd be interested in finding out how the "real" tail handles different
encodings...
 
John Wood said:
Well sorry, "it" was actually the download on the page I thought!

Unfortunately it's not clear from the page that the download will be
any different from the code which is prominently displayed to start
with. Could I suggest that you either change the blog entry to have the
new code, or add something (at the top, in bold) to make it clear it's
not the most recent version?

John Wood said:
Anyway you do make some very valid points, but it should work ok for ascii.

True.

John Wood said:
Regarding the "reading into a buffer instead", it seems to perform quite
well because the stream seems to be buffered anyway (assuming it's not
forward only buffering).

I'd be interested in finding out how the "real" tail handles different
encodings...

I have a sneaking suspicion it might not - or at least, older versions
wouldn't. Given enough work, most character encodings could be
supported, but you need more knowledge of them than the .NET framework
gives you. (For instance, you can fairly easily tell when you've got
to the first byte of a UTF-8 sequence, but you'd need to specifically
know that you're dealing with UTF-8 - the Encoding doesn't help you.)
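
As a sketch of that UTF-8 check: continuation bytes always have the bit pattern 10xxxxxx, so anything else is the first byte of a character. This assumes you already know the data is valid UTF-8, and the class and method names are made up for illustration.

```csharp
public static class Utf8BoundarySketch
{
    // In UTF-8, continuation bytes have the top two bits set to 10.
    public static bool IsContinuationByte(byte b)
    {
        return (b & 0xC0) == 0x80;
    }

    // Steps back from `index` to the first byte of the character containing it.
    // Assumption: `buffer` holds valid UTF-8.
    public static int FindCharacterStart(byte[] buffer, int index)
    {
        while (index > 0 && IsContinuationByte(buffer[index])) index--;
        return index;
    }
}
```

For the bytes of "a€b" (61 E2 82 AC 62), asking for the character start at index 3 steps back to index 1, where the three-byte euro sign begins.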
 
This block outputs the last ten lines of a web server log file, most recent line first:

// Requires: using System.Collections; using System.Diagnostics; using System.IO;
Stack s = new Stack();
string filename =
    @"C:\WINDOWS\system32\Logfiles\W3SVC1\ex050216.log";

// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filename))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        s.Push(line);
    }
}

// Pop at most 10 lines; stop early if the file has fewer than 10.
for (int i = 0; i < 10 && s.Count > 0; i++)
{
    string output = (string)s.Pop();
    Debug.WriteLine(output);
}
 
Patrick A said:
This block outputs the last ten lines in a www log file:

<snip>

Yes, but again, it requires reading in the whole file, which I believe
the OP doesn't want to do. (If you're going to do that, using an
ArrayList would get a more flexible collection at the end of it - it's
easy to get the last 10 lines from an ArrayList, but it's hard to get
random access to a Stack, which would often be useful.)
 
Just a thought, but have you considered using an array, as mentioned by Jon
Skeet?
I don't understand his assumption that you don't want to read the whole file, as
you've made no mention of such a requirement.

Just a suggestion, but if you have Visual Basic .NET you might try this:

Option Explicit On
Option Strict On
Imports System.IO

Module Module1
    Public Sub Main()
        Dim lines As String()
        Try
            ' assuming ASCII
            Dim sr As StreamReader = New StreamReader("yourfilehere.log", System.Text.Encoding.ASCII)
            Dim data As String
            ' assuming the tab-delimited lines are terminated by CrLf
            data = Replace(sr.ReadToEnd(), ControlChars.CrLf, ControlChars.Cr)
            lines = data.Split(CChar(ControlChars.Cr))
            sr.Close()
        Catch E As Exception
            ' Let the user know what went wrong.
            Console.WriteLine("The log file could not be read:")
            Console.WriteLine(E.Message)
            Exit Sub
        End Try
        Try
            ' assuming ASCII
            Dim sw As StreamWriter = New StreamWriter("copy of yourfilenamehere.log", False, System.Text.Encoding.ASCII)
            ' Lines are written newest first. Assuming the file ends with CrLf,
            ' the last array element is empty, so start one element before it.
            If UBound(lines, 1) >= 10 Then
                Dim i As Integer
                For i = 1 To 10
                    sw.WriteLine(lines(UBound(lines, 1) - i))
                Next
            Else
                ' Fewer than 10 entries: write whatever is there.
                Dim i As Integer
                For i = 1 To UBound(lines, 1)
                    sw.WriteLine(lines(UBound(lines, 1) - i))
                Next
            End If
            sw.Close()
        Catch E As Exception
            Console.WriteLine("Something broke O_o?")
            Console.WriteLine(E.Message)
        End Try
    End Sub

End Module
 