File IO Optimization

Bill Pierce · Nov 28, 2006

Is it faster to read through a file, without checking for end of
stream, and catch the EndOfStreamException or to do a check for
position vs. length after reading each line of a file? This is using a
BinaryReader.

I am going to set up some performance tests but wanted to gather any
input from the learned groopies.

Tom Leylan · Nov 28, 2006

Hi Bill:

I'm going to take a guess and say "it depends" :-)

If it is a large file, you'll surely gain speed by not checking after
reading each line, but you will take a hit when the end of the stream is
reached. Even if the time to handle that exception is significant, it will
hardly affect the overall time if the file is large. On short files, though,
it would (it seems) be a significant part of the total time, since they
would spend most of their time processing the exception.
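
One rough way to make that trade-off concrete is a simple cost model. The symbols here are assumptions for illustration, not measurements: N is the number of reads, r the cost of one read, c the cost of one position check, and E the one-time cost of throwing and catching the exception.

```latex
\begin{align*}
T_{\text{check}}(N) &= N\,(r + c) \\
T_{\text{catch}}(N) &= N\,r + E \\
T_{\text{catch}}(N) < T_{\text{check}}(N) &\iff N > \frac{E}{c}
\end{align*}
```

Under this model the exception approach only wins once the number of reads exceeds E/c, which is why file size matters.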

I think we'd all like to see the results of your tests :-)

Gadget · Nov 28, 2006

Exceptions are slow, so checking for EOF is certainly going to be faster.
Internally the framework has to check for EOF on every read statement, and
as it's the file-system that provides the length of the file (not the
data), the framework already knows how long the file is, so it's a simple
maths operation internally.

Cheers,
Gadget

Jon Skeet [C# MVP] · Nov 28, 2006

How are you reading the data? Do you actually need to use BinaryReader
rather than just a Stream? If not, just call Stream.Read repeatedly
until the return value is 0.
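
A minimal sketch of that suggestion, for illustration only (it is not code from the thread). The class name `StreamReadSketch`, the method `CountBytes`, and the 4096-byte buffer size are all hypothetical choices; the only behavior relied on is that `Stream.Read` returns 0 at end of stream and may return fewer bytes than requested.

```csharp
using System;
using System.IO;

public static class StreamReadSketch
{
    // Hypothetical helper: reads the stream to the end in chunks and
    // returns the total number of bytes seen. No exception is used to
    // detect the end of the stream.
    public static long CountBytes(Stream stream, int bufferSize = 4096)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // Stream.Read returns 0 only when the end of the stream is reached.
        // It may return fewer bytes than requested, so use the return value.
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }

        return total;
    }
}
```

With a file this could be called as `StreamReadSketch.CountBytes(File.OpenRead(fileName))`; a real loop would process `buffer[0..read)` where the sketch only counts.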

Bill Pierce · Dec 1, 2006

The results of my testing didn't seem very conclusive. I might have gone
about it the wrong way, but anyway...
It all depends on how many reads you do on the file. It appears that for
fewer than 5000 reads it is faster to check position/length; for more than
5000 reads it is faster to catch the EndOfStreamException.

Here is the code I used for the tests, using files of varying length and
averaging multiple reads of each file. The file being read is just a
bunch of sequential uints written using a BinaryWriter.

private static double ReadWithPositionCheck(string fileName)
{
    // QueryPerformanceCounter and Frequency are declared elsewhere
    // (not shown in this post).
    long start = 0, end = 0;

    using(Stream stream = File.Open(fileName, FileMode.Open,
        FileAccess.Read, FileShare.Read))
    {
        using(BinaryReader reader = new BinaryReader(stream))
        {
            QueryPerformanceCounter(out start);

            long length = reader.BaseStream.Length;
            long position = 0;

            // Keep reading while a full 4-byte uint remains, so the
            // last value in the file is read as well.
            while(position + sizeof(uint) <= length)
            {
                reader.ReadUInt32();
                position += sizeof(uint);
            }

            QueryPerformanceCounter(out end);
        }
    }

    // Elapsed time in seconds.
    return (double)(end - start) / Frequency;
}

private static double ReadWithoutPositionCheck(string fileName)
{
    long start = 0, end = 0;

    using(Stream stream = File.Open(fileName, FileMode.Open,
        FileAccess.Read, FileShare.Read))
    {
        using(BinaryReader reader = new BinaryReader(stream))
        {
            QueryPerformanceCounter(out start);

            try
            {
                // Read until ReadUInt32 runs out of data and throws.
                while(true)
                {
                    reader.ReadUInt32();
                }
            }
            catch(EndOfStreamException)
            {
                // Expected: reaching the end of the file ends the loop.
            }

            QueryPerformanceCounter(out end);
        }
    }

    // Elapsed time in seconds.
    return (double)(end - start) / Frequency;
}
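
For anyone wanting to reproduce the tests, here is a hedged sketch of how a test file of sequential uints could be generated. It is not the code actually used for the measurements; the class name `TestFileSketch` and the method `WriteSequentialUInts` are hypothetical.

```csharp
using System;
using System.IO;

public static class TestFileSketch
{
    // Hypothetical helper: writes the values 0, 1, ..., count - 1 as
    // 4-byte unsigned integers, so the file is exactly 4 * count bytes long.
    public static void WriteSequentialUInts(string fileName, uint count)
    {
        using (Stream stream = File.Open(fileName, FileMode.Create,
            FileAccess.Write, FileShare.None))
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            for (uint i = 0; i < count; i++)
            {
                writer.Write(i); // BinaryWriter.Write(uint) emits 4 bytes
            }
        }
    }
}
```

Calling `TestFileSketch.WriteSequentialUInts(path, 5000)` would produce a file near the 5000-read crossover reported above; varying `count` gives files on either side of it.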
