Idea for Reader optimisation for reading

  • Thread starter: Guest

Guest

Hi,

I was using the Reader class to read a binary file and I noticed that
there is no PeekByte, and also that it's not optimised when Peek and Read
are used in sequence.

If .Peek is called it reads the next char, and then a .Read afterwards
reads it again, so the data is read twice. Why can't .Read use the stored
Peek result internally? It would be twice as fast.
 
Hi,

Thanks for your post. As I understand, you have some performance concerns
about the Reader class. I'd like to share the following information with
you:

Which Reader class are you using?

Generally speaking, the Peek method of the Reader classes (say, StreamReader,
etc.) returns the next available character without consuming it. That is, it
should behave the same as the PeekByte you suggested.

In addition, please kindly note that the FileStream has buffering
internally to improve read and write performance. A buffer is a block of
bytes in memory used to cache data, thereby reducing the number of calls to
the operating system. A call to .Read after a .Peek call will get data from
the cache, so I believe there is no need to worry about the performance.

Please feel free to let me know if you have any problems or concerns.

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD

Get Secure! -- www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
 
BinaryReader has no .PeekByte, yet it's a binary class. Why not?

This means we CANNOT use the following method to read files..


byte b;
while (br.PeekChar() != -1)
{
    b = br.ReadByte();
}

The above will fail on certain file contents that are not "chars".

Now we have to do..

try
{
    byte b;
    while (true)
    {
        b = br.ReadByte();
    }
}
catch
{
}

A binary class with ReadByte but no PeekByte seems strange and looks like an
oversight. Maybe I'm missing something from the picture, but it seemed
logical to me to have it there, as a lot of people seem to code the first
version with .PeekChar, and that is doomed to fail on some binary files even
though it's the BinaryReader.

Just as a side note, why can't the .PeekChar method buffer this char so the
next .ReadChar won't have to re-read the file? I just noticed a 2x
performance improvement when I removed this double reading of the file
(.Peek then .Read operations).
 
Using an exception to signal EOF is BAD. It's expensive and shoddy coding.
Why do you encourage this in the BinaryReader and other libraries?

Using a .Peek is better for detecting that there are no more bytes to read,
yet you don't have this functionality; you decide to throw an expensive
exception, way to go. Not only that, a Peek followed by a Read dips into the
file twice whereas it should only be once. Why is this, and what genius
dreamt up this hoopla?
 
Hello,

Thanks for your feedback. I understand your concerns, and now I'd like to
share the following information with you:

1. Did you specify the encoding when creating the BinaryReader? If you did
not specify one, it will use UTF8Encoding, and I believe that may be the
reason why PeekChar returns -1 on some characters. I suggest you use ASCII
encoding as follows to check whether it works:

BinaryReader r = new BinaryReader(fs, System.Text.Encoding.ASCII);

2. In addition, I do not recommend using PeekChar to check for EOF, because
it also returns -1 when the underlying stream does not support seeking.
Alternatively, I suggest that you get the length of the file and then read
the bytes in a loop. Please refer to the following code snippet:

//--------------------code snippet--------------------
FileInfo fi = new FileInfo(FILE_NAME);
FileStream fs = fi.OpenRead();
BinaryReader r = new BinaryReader(fs);

// Read data: fi.Length is a long, so the counter is a long as well
for (long n = 0; n < fi.Length; n++)
{
    Console.WriteLine(r.ReadByte());
}
......
//--------------------------end of----------------------

I am standing by for your feedback.

Have a nice day!

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD

 
I used FileStream.BaseStream.Position and FileStream.BaseStream.Length to
determine if I reach EOF or not.

But I read that FileStream.Read is not guaranteed to read everything from my
initial position up to count in a single call.
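
For what it's worth, the usual way to cope with partial reads is to keep
calling Read until the requested number of bytes has arrived or Read returns
0. A minimal sketch follows; StreamHelpers and ReadFully are hypothetical
names made up for this illustration, not part of the framework:

```csharp
using System;
using System.IO;

static class StreamHelpers
{
    // Hypothetical helper (not a framework method): calls Stream.Read
    // repeatedly until 'count' bytes have been read or the stream ends.
    // Returns the number of bytes actually placed in the buffer.
    public static int ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count)
        {
            // Read may return fewer bytes than requested
            int n = stream.Read(buffer, offset + total, count - total);
            if (n == 0)
            {
                break; // end of stream reached before 'count' bytes arrived
            }
            total += n;
        }
        return total;
    }
}
```

If the return value is smaller than count, the stream ended early, so the
caller can detect EOF without Peek and without catching an exception.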
 
I used FileStream.BaseStream.Position and FileStream.BaseStream.Length to
determine if I reach EOF or not.

I presume you mean BinaryReader.BaseStream.Position/Length? FileStream
doesn't have a BaseStream property.

If you *do* mean BinaryReader.BaseStream, you shouldn't use that for
detecting EOF. The BinaryReader may buffer some data, so even if the
FileStream is at the end, there may be more data to be read from the
BinaryReader.
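
One way to sidestep the question is to read from the stream directly and
treat a return value of 0 from Read as the end of the data. This is only a
sketch: EofExample and SumBytes are hypothetical names, and the 4096-byte
buffer size is an arbitrary choice.

```csharp
using System;
using System.IO;

static class EofExample
{
    // Hypothetical helper: reads the whole stream in chunks and returns the
    // sum of all byte values. Stream.Read returns 0 only at end of stream,
    // so no Peek call and no exception is needed to detect EOF.
    public static long SumBytes(Stream stream)
    {
        byte[] buffer = new byte[4096]; // arbitrary chunk size
        long sum = 0;
        int n;
        while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < n; i++)
            {
                sum += buffer[i]; // process each byte here
            }
        }
        return sum;
    }
}
```

Each chunk is read once, so there is no second trip to the file for a
peeked value either.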
 