Is there a way to keep the StreamWriter open?

Bishop
Here is my current code:

Imports System.IO

Partial Class _Default
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load

        Dim FILENAME As String = "C:\inetpub\wwwroot\zzJunkWriteTextFile\Log\output.csv"

        Dim objStreamWriter As StreamWriter
        objStreamWriter = File.AppendText(FILENAME)
        objStreamWriter.WriteLine(Request.RawUrl & "," & DateTime.Now.ToString())
        objStreamWriter.Close()

    End Sub

End Class





Every time someone hits this page it writes a line to a text file. I wanted
to find out if I can keep the StreamWriter open to speed up the process?
 
Every time someone hits this page it writes a line to a text file. I wanted
to find out if I can keep the StreamWriter open to speed up the process?

Hi, not in any way that would be efficient. A better approach is to use a
database or a similar store that is optimized for this kind of data handling.

Teemu Keiski
ASP.NET MVP
 
Short answer: No.

Your server-side code (the code behind part) is run on the server (as the
name implies) and ends as soon as the page is transmitted to the browser.
When any client (the current visitor or a new one) requests a page, the
entire process is repeated. Any objects that you create or reference during
that page processing are destroyed whenever the page processing finishes (in
theory at least).
 
The reason I want to write to a text file is because the database isn't able
to keep up with the heavy traffic. My desire is to have this page write CSV
files that I can later bulk insert into SQL.

I could write a .NET Windows service that would keep the file open and write
to it, but is there a way that I can communicate with the service without
having to instantiate an object each time?




 
Every time someone hits this page it writes a line to a text file. I wanted
to find out if I can keep the StreamWriter open to speed up the process?

Why are you using a StreamWriter to log? There are much better ways to do
that, including some that are "built in".

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************
 
Logging marketing data to a CSV file. We then need to massage the data, and
perform a bulk insert.
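
For the load side, something like SqlBulkCopy could handle that later bulk
insert. Here is a rough sketch; the table name, columns, and the naive comma
split are illustrative assumptions (a URL containing commas would need real
CSV parsing):

Imports System.Data
Imports System.Data.SqlClient
Imports System.IO

Public Module CsvLoader

    Public Sub Load(ByVal csvPath As String, ByVal connStr As String)
        ' Shape an in-memory table to match the destination table.
        Dim table As New DataTable()
        table.Columns.Add("RawUrl", GetType(String))
        table.Columns.Add("HitTime", GetType(DateTime))

        For Each line As String In File.ReadAllLines(csvPath)
            Dim parts() As String = line.Split(","c)
            table.Rows.Add(parts(0), DateTime.Parse(parts(1)))
        Next

        ' One bulk operation instead of a row-by-row INSERT.
        Using bulk As New SqlBulkCopy(connStr)
            bulk.DestinationTableName = "dbo.HitLog"
            bulk.WriteToServer(table)
        End Using
    End Sub

End Module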
 
Logging marketing data to a CSV file. We then need to massage the
data, and perform a bulk insert.

You can use pretty much any of the logging libraries to do this rather
efficiently (the built-in System.Diagnostics tracing, sketched below, is one
option). I would also argue that it might be more prudent to log to a staging
database and then move the data over to the database where you have your
marketing guys run reports.

Leaving the StreamWriter open is an option, I suppose, but you are holding
resources for a long time, which is not an approach I would like.
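
For illustration, a rough sketch of that tracing route; the module name, log
path, and flush policy are placeholder assumptions, and it presumes the TRACE
constant is defined for the build:

Imports System.Diagnostics

Public Module HitTrace

    Private _initialized As Boolean = False
    Private ReadOnly _lock As New Object()

    Public Sub LogHit(ByVal rawUrl As String)
        SyncLock _lock
            If Not _initialized Then
                ' Placeholder path; the listener keeps its file open for
                ' the life of the AppDomain.
                Trace.Listeners.Add(New TextWriterTraceListener("C:\Logs\hits.csv"))
                Trace.AutoFlush = False ' buffer writes rather than flush per line
                _initialized = True
            End If
        End SyncLock
        ' Trace itself is documented as thread-safe, so no lock per write.
        Trace.WriteLine(rawUrl & "," & DateTime.Now.ToString())
    End Sub

End Module

Bear in mind that with AutoFlush off, an application recycle can drop any
buffered lines, so treat this as a sketch rather than production code.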


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************
 
The database can't keep up, which is why I wanted to write to a text file.

The plan would be to create new files on a regular basis, possibly every
minute depending on the traffic. When a new file is created, the closed files
would be moved to another server for processing and inserted into our
database.

How would I go about keeping it open? Would using an application variable
work or is there a better way?
 

Here is a built-in option: http://msdn.microsoft.com/en-us/library/bb398933.aspx

Health monitoring is extremely powerful and extensible. While I mainly
use it to track errors, you could use one of the other providers, such as
http://msdn.microsoft.com/en-us/library/system.web.management.sqlwebeventprovider.aspx,
to track request events instead
(http://msdn.microsoft.com/en-us/library/system.web.management.webrequestevent.aspx).
If you use just the built-in events and providers, you could have request
tracking up in ~30 minutes. It would merely require running aspnet_regsql for
the SQL provider and modifying the web.config.
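
For illustration, a rough web.config sketch of such a rule; the rule name is
made up, while the event and provider names are the defaults registered in
machine.config, so verify them against your setup:

<system.web>
  <healthMonitoring enabled="true">
    <rules>
      <!-- Route request events to the built-in SQL provider
           (assumes aspnet_regsql has installed the events schema). -->
      <add name="Request Tracking"
           eventName="Request Processing Events"
           provider="SqlWebEventProvider"
           minInterval="00:00:00" />
    </rules>
  </healthMonitoring>
</system.web>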

In addition, I would suggest implementing it for error logging; it has
saved my ass a few times.

Hope this helps!

- Nathan "Norm" Cota
 
The database can't keep up, which is why I wanted to write to a text
file.


Now that I know this, I would try to figure out why the database is not
keeping up. Is it because you have so many users hitting the site that
you are overtaxing the hardware, or could it be something problematic
with your data access methodology? How many users are using the site at
any one time?

There are lots of ways of speeding up data input, even logging type
input of data.

1. Add bigger hardware
2. Beef up machines
3. Create a cluster
4. Separate data storage from data server
5. Get rid of drag and drop bits from your code
6. Set up an asynchronous method to store in the database

When you say the database can't keep up, I have one question: Does the
database not keeping up stop users from accessing other data? If no,
then the database keeping up is not the real issue. It is more likely
the way things are coded.

What I am getting at here is that beefing up the hardware may be a better
option than kludging together some "offline data store" in a text file
that is later uploaded to the database. Unless this data is uploaded to
another database, you are not really solving the problem, at least not
long term, and you may eventually hit a much harder wall than you are
hitting right now.

The plan would be to create new files on a regular basis, possibly
every minute depending on the traffic. When a new file is created, the
closed files would be moved to another server for processing and
inserted into our database.

What data are you tracking? The reason I ask is that the basic information
(user IP, page hit, etc.) is already tracked by IIS.

How would I go about keeping it open? Would using an application
variable work or is there a better way?


You can create a static class that holds the StreamWriter and continue
to flow data in this way (a rough sketch follows), but that is only part
of the problem you need to solve. If you go this direction, you also need
to solve how to gracefully shut down the file you are writing to and then
open a new StreamWriter to start handling the current hits.
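
A minimal sketch of that static-class idea; the class name, log path, and
one-file-per-minute rotation scheme are assumptions, not anything from this
thread:

Imports System.IO

Public NotInheritable Class HitLog

    Private Shared _writer As StreamWriter
    Private Shared ReadOnly _lock As New Object()

    ' Append one line, opening the current file on first use.
    ' SyncLock serializes concurrent requests onto the single writer.
    Public Shared Sub Write(ByVal line As String)
        SyncLock _lock
            If _writer Is Nothing Then
                _writer = File.AppendText(CurrentPath())
            End If
            _writer.WriteLine(line)
        End SyncLock
    End Sub

    ' Close out the current file (for example from a timer) so the
    ' closed log can be shipped off for the bulk insert.
    Public Shared Sub Rotate()
        SyncLock _lock
            If _writer IsNot Nothing Then
                _writer.Close()
                _writer = Nothing
            End If
        End SyncLock
    End Sub

    Private Shared Function CurrentPath() As String
        ' Hypothetical naming scheme: one file per minute.
        Return "C:\Logs\hits-" & DateTime.Now.ToString("yyyyMMdd-HHmm") & ".csv"
    End Function

End Class

With something like this, the Page_Load body collapses to
HitLog.Write(Request.RawUrl & "," & DateTime.Now.ToString()), and Rotate()
handles the graceful close-out described above.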

I see this as a train wreck waiting to happen. If you have too much data
for the database, keeping a StreamWriter open is probably going to fail
as well.

My suggestion would be to pull in an architecture consultant for a few
days and have them help deconstruct the current architecture and define
a new solution. It may end up costing a few thousand, but if you already
have a scale problem, it may be the best few thousand your company ever
spends.


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************
 
Here is a built-in option:
http://msdn.microsoft.com/en-us/library/bb398933.aspx

Health monitoring is extremely powerful and extensible. [...]

- Nathan "Norm" Cota

I think these are great suggestions. Health monitoring, and
instrumentation in general, are good things.

But health monitoring is not the real issue here. It is scaling the
application, at least from what I read. He is pumping so much data into
the database that he has overtaxed the system.


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************
 
Thanks for the info and suggestions.

I'm doing this to log cookie tracking data. We will have tens of millions
of records each day, at minimum and potentially hundreds of millions.

Even when only using minimal indexing, it's going to be a lot more
resource-expensive to write transactionally to a SQL table than to a text
file. I am concerned with timeouts but also with slow response time.

The data is initially logged and then needs to be massaged quite a bit
before it is stored in its final location.

This server is dedicated to this one task and the end data will not reside
on it. If I can make it work using a text file scenario, I won't even need
SQL installed on this server.

According to PerfMon the bottleneck has always been the disk. Since this is
almost entirely a write-only scenario for SQL, I set up separate physical
disks for the OS, TempDB, the database, and the log file.

I could consider moving to a flash drive, but since there is a lot of
massaging to do on the data, working with the text files first doesn't seem
too bad to me.
 

According to PerfMon the bottleneck has always been the disk. Since
this is almost entirely a write-only scenario for SQL, I set up separate
physical disks for the OS, TempDB, the database, and the log file.


Based on what you have written thus far, having the indexes on a different
partition might also help.
I could consider moving to a flash drive, but since there is a lot of
massaging to do on the data, working with the text files first doesn't
seem too bad to me.


There are two different problems:

1. Getting the data from the cookies and into a data store
2. Reporting from the data store

They do not have to be solved in the same database. In fact, if you are
really drilling into the data, the reporting is probably better off of a
data warehouse, depending on the data mining necessary.

If the disk is the bottleneck, there are solutions for that. Reducing
indexing on the database you are inputting to is certainly one of the steps.
In fact, other than the index on the primary key, which is most likely a
derived key, there is no need to have any other indexes slowing things down.

If the issue is largely with peak times, you could consider using something
like SQL Service Broker or even Message Queuing (MSMQ) as a queueing
mechanism for input into the database; a rough sketch of the latter follows.
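
To make the queueing route concrete, a rough MSMQ sketch; it assumes a
reference to System.Messaging.dll and a private queue named
".\Private$\hitlog" that already exists, with a service on the database
side draining it at its own pace:

Imports System.Messaging

Public NotInheritable Class HitQueue

    Private Shared ReadOnly _queue As New MessageQueue(".\Private$\hitlog")
    Private Shared ReadOnly _lock As New Object()

    Public Shared Sub Enqueue(ByVal line As String)
        ' MessageQueue instance members are not guaranteed thread-safe,
        ' so serialize access across requests.
        SyncLock _lock
            _queue.Send(line)
        End SyncLock
    End Sub

End Class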

Faster drives are another option, as is partitioning out the database. It
is fairly amazing how much increasing the RPM helps. Having a much larger
cache on the drive also works wonders. Ultimately, moving to a SAN is often
the best option.

Now, with the millions of rows per day, you are still going to have some
issues with a file, as it gets huge rather quickly. But you are already
expecting that, which is why you want to keep the writer open. And, yes,
it is possible to set up a service that you simply throw things at and
that keeps the writer open. But, as I mentioned, the controller that flips
to different files over time is still a very large piece that will need to
be architected correctly. If you can figure out how to solve the database
issue, that would be better.


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************
 
Are you suggesting that using the same hardware but with good SQL
configuration I can expect the same speed writing transactionally to SQL
that I can get writing to a text file with some decent programming wrapped
around it?

My gut feeling is that with good SQL configuration and better hardware it
would be able to keep up, but not be faster. If that's the case, I fail to
see any big advantage in writing directly to SQL, at least from a financial
standpoint, given the cost of SQL and better hardware.

I'm not trying to argue, but everyone seems to think I should solve my
problem using anything other than writing to a text file. I completely agree
that the process would require less programming and be easier to accomplish
if I were able to write directly to SQL; I'm just not convinced it's the
right solution.

I've read many times that you shouldn't write your IIS logs directly to SQL
if you have a busy website because it's so much faster to write them to a
text file. My process seems to be very similar to IIS logging.

Thank you again for your suggestions, I do appreciate it.
 
Are you suggesting that using the same hardware but with good SQL
configuration I can expect the same speed writing transactionally to
SQL that I can get writing to a text file with some decent programming
wrapped around it?


It is possible. I am not sure how much data you have on a single row,
but I have worked on systems that performed millions of transactions a
day without the type of performance problems you are talking about.

Here are a couple of things I have thought about.

1. Look at your insert method. It should be very lean (see the sketch
below). Example:

a) Load the variables you are tracking into SqlParameter objects
b) Instantiate the connection and command with the parameters
c) Fire off ExecuteNonQuery()

If you make the insert lean (no LINQ, EF, DataSet/TableAdapter), things
will go much faster, especially if the database you are inserting into is
insert-only and you are not reporting off of it.
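
As a rough sketch of that lean insert, with table, column, and parameter
names as placeholders:

Imports System.Data.SqlClient

Public Module HitInsert

    Public Sub LogHit(ByVal connStr As String, ByVal rawUrl As String)
        Dim sql As String = "INSERT INTO dbo.HitLog (RawUrl, HitTime) VALUES (@url, @time)"
        Using conn As New SqlConnection(connStr)
            Using cmd As New SqlCommand(sql, conn)
                ' a) load the tracked values into parameters
                cmd.Parameters.AddWithValue("@url", rawUrl)
                cmd.Parameters.AddWithValue("@time", DateTime.Now)
                ' b) open the connection, c) fire off the insert
                conn.Open()
                cmd.ExecuteNonQuery()
            End Using
        End Using
    End Sub

End Module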

2. Reduce the amount of information per insert

If you are inserting the same information over and over again for a
particular user, you can normalize the data a bit so you are merely
tracking new information. That is just one example.

Not knowing how you are inserting, I am not sure how much can be sped
up, but the thinner the data layer is, the faster it operates.

My gut feeling is that with good SQL configuration and better hardware it
would be able to keep up, but not be faster. If that's the case, I fail to
see any big advantage in writing directly to SQL, at least from a financial
standpoint, given the cost of SQL and better hardware.


And this may be, in your particular case. The question is whether the
company spends money on better hardware and some code tweaks or it
spends it to have a logging mechanism built, along with some custom bulk
ETL software. The big question is whether they see your time as a fixed
cost or not.

I'm not trying to argue, but everyone seems to think I should solve my
problem using anything other than writing to a text file. I completely
agree that the process would require less programming and be easier to
accomplish if I were able to write directly to SQL; I'm just not
convinced it's the right solution.


I am going on my experience. I also have a lack of knowledge of the
particulars of your system, which I admit may be a stumbling block for
me. In general, I would not head the direction you are heading.

If buying additional hardware is out, it is a constraint you have to
live with. Perhaps thinning out data access will help SQL keep up, but
it might not. You always have the file log direction to head.

But, as you have already expressed, you will end up with a management
piece on top of the logger. The reason for this is you will have to
close out logs eventually to upload. And, as you have expressed, it
might be every couple of hours. If this is not automated, it can
consume a great deal of your time.

I've read many times that you shouldn't write your IIS logs directly
to SQL if you have a busy website because it's so much faster to write
them to a text file. My process seems to be very similar to IIS
logging.

It is, in some ways, but the process writing the IIS logs is optimized
for large logs. In addition, it sounds like your reporting needs are a
bit different.

I imagine you might be able to find some type of third-party component
that can set aside space for a file and fill it in a non "line by line"
manner. I am not sure where you would search for that. It will likely end
up being a native Windows component.
Thank you again for your suggestions, I do appreciate it.


No problem. I enjoy hearing issues and problems and getting my head
around them. ;-)


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

Twitter: @gbworld
Blog: http://gregorybeamer.spaces.live.com

*******************************************
| Think outside the box! |
*******************************************
 