What is the best way to read a text file?

  • Thread starter: Guoqi Zheng

Guoqi Zheng

Dear sir/lady,

I need to process our web server log file every day. The file is normally between 50 MB and several hundred MB. How can I read it efficiently?

I can open a StreamReader and call ReadLine() to read and process the file line by line. However, this sounds very slow. I can also use ReadToEnd() to read everything into a string, but I am afraid the result will be too big for memory.

How should I read the file?
 
Hi Guoqi Zheng,

A third solution might be to use Stream.Read and read a chunk of bytes.
Process the chunk and read some more. ...

Anyway, StreamReader.ReadLine might not be so slow either.
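
The chunked approach mentioned above could look roughly like the following. This is only a sketch in Python rather than .NET, with `stream.read(chunk_size)` standing in for Stream.Read; the function name `iter_lines_chunked` and the 64 KB chunk size are assumptions made for illustration. The part that needs care is the line that straddles two chunks, which is carried over until its newline arrives.

```python
import io
from typing import BinaryIO, Iterator


def iter_lines_chunked(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield complete lines from a binary stream, reading chunk_size bytes at a time."""
    leftover = b""  # partial line carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        pieces = (leftover + chunk).split(b"\n")
        leftover = pieces.pop()  # last piece may be an incomplete line
        for line in pieces:
            yield line.rstrip(b"\r")  # tolerate Windows line endings
    if leftover:
        yield leftover.rstrip(b"\r")  # final line without a trailing newline


if __name__ == "__main__":
    # Tiny in-memory stand-in for a log file, with a deliberately small chunk size.
    sample = io.BytesIO(b"line one\r\nline two\nline three")
    for line in iter_lines_chunked(sample, chunk_size=8):
        print(line.decode("ascii"))
```

Only one chunk plus one partial line is held in memory at a time, so the size of the file does not matter much. A buffered line reader does essentially this bookkeeping internally, which is one reason reading line by line is usually not as slow as it sounds.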
 
Why do you need to read the file? Are you preparing an application to parse
log files?


--
Regards,
Alvin Bruney [Microsoft MVP ASP.NET]

[Shameless Author plug]
The Microsoft Office Web Components Black Book with .NET
Now Available @ http://www.lulu.com/owc
 
Don't think I've ever seen the contents of this file. Does it have any type of structure? Does it
use field delimiters?


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 
Log files follow different formats depending on which of the supported
standards is in use.

If you are looking to parse the log file for errors or just security
concerns, you should use Microsoft's Log Parser utility, which
ships free as an IIS resource download, instead of reinventing the wheel
with your own implementation. It can parse log files
by running real SQL queries against the raw text file.

It's quite a remarkable piece of machinery, actually. Log Parser
is a command-line tool with no GUI, so you have to learn all the
nuts and bolts. But I've built a web GUI front end that sits on top of
Log Parser, available on www.logparser.com. Serverstats, which is open source
by the way, eliminates the need for the user to learn SQL queries and the
nuts and bolts of the Log Parser utility, which can be a bit intimidating. You
can point and click your way through any W3C extended log file format.


--
Regards,
Alvin Bruney [Microsoft MVP ASP.NET]

 
On Mon, 14 Feb 2005 13:47:30 -0500, "Alvin Bruney [MVP]" <vapor at steaming post office> wrote:

Alvin,

¤ [...] You
¤ can point and click your way thru any w3c extended log file format.

Do you know what the file extensions for these log files are?


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 
.w3c for the W3C standard. If you open inetmanager and right-click the default
website, there is a logging checkbox under the website tab. Click
Properties there to find where the files are on the system and what format is
in use. W3C is the default.

--
Regards,
Alvin Bruney [Microsoft MVP ASP.NET]

 
Guoqi Zheng said:
I need to process our web server log file every day. Normally, this
file is between 50 MB and several hundred MB. How can I read the file in an
efficient way?

I can open a StreamReader and use the ReadLine() function to
read line by line and process them. However, this sounds very slow. I
can also use ReadToEnd() to read everything into a string, but then I
am afraid that it will be too big for memory.

"This sounds very slow" isn't a good basis on which to make a
judgement. Have you tried it? ReadLine is actually probably the best
way to go if you can get away with processing a line at a time.
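
To make the line-at-a-time idea concrete, here is a minimal sketch, written in Python rather than .NET purely for illustration. It assumes a space-delimited log in which directive lines start with "#", as in the W3C extended format; the function name `count_field_values` and the field position are hypothetical and would have to match the actual log layout.

```python
from collections import Counter
from typing import Iterable


def count_field_values(lines: Iterable[str], field_index: int) -> Counter:
    """Tally the value found at field_index in each space-delimited data line."""
    counts: Counter = Counter()
    for line in lines:  # only one line is held in memory at a time
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and "#" directive lines
        fields = line.split()
        if field_index < len(fields):
            counts[fields[field_index]] += 1
    return counts
```

A file object can be passed in directly, for example `with open(path) as log: count_field_values(log, 3)`, where `path` is a placeholder for the real log location. Because the file is iterated lazily, memory use stays flat whether the log is 50 MB or several hundred, so it is worth timing this before trying anything more elaborate.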
 
Actually, I've gone a little further with the idea. Log Parser makes it
possible to treat any text file as a database. I take a dataset or XML file
and rewrite it as a W3C file (really just headers at the top, with
space-delimited values). Then I use Log Parser to query the text file just
like a database. It's a pretty nifty approach when there isn't a real database
close by, when going cross-platform, or when I don't want to use a
memory-intensive tool like an XML parser.
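
The rewriting step described above might be sketched as follows, in Python for illustration only. The `#Version` and `#Fields` directives come from the W3C extended log file format; everything else is an assumption: the function name `write_w3c_style`, the use of "-" for a missing value, and the replacement of whitespace inside values with "+" so that each row stays space-delimited. Whether Log Parser accepts exactly this output is not confirmed here.

```python
import io
from typing import Iterable, Mapping, Sequence, TextIO


def write_w3c_style(out: TextIO, field_names: Sequence[str],
                    rows: Iterable[Mapping[str, object]]) -> None:
    """Write rows as space-delimited text under W3C-style header directives."""
    out.write("#Version: 1.0\n")
    out.write("#Fields: " + " ".join(field_names) + "\n")
    for row in rows:
        values = []
        for name in field_names:
            # Assumption: whitespace inside a value becomes "+", and an
            # empty or missing value is written as "-".
            value = "+".join(str(row.get(name, "")).split())
            values.append(value or "-")
        out.write(" ".join(values) + "\n")


if __name__ == "__main__":
    buffer = io.StringIO()
    write_w3c_style(buffer, ["name", "city"],
                    [{"name": "Ann Lee", "city": "Oslo"}, {"name": "Bob"}])
    print(buffer.getvalue(), end="")
```

The rows are written one at a time, so the source data never has to be held in memory as a whole beyond whatever the caller already has.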

--
Regards,
Alvin Bruney [Microsoft MVP ASP.NET]

 