Performance and Memory Usage Consideration about DataTable and Dat

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi,

How do I free up the memory taken up by the .NET datatable and dataset after
I finished working with them? How do I HELP the GC to do its job sooner to
release memory taken up by DataTable/DataSet?

I am worried about memory usage. Should I be using DataReader exclusively?
But complex databinding is going to cause problem to me because it is going
to keep the connection open for longer than necessary.

Would it help if I create an arraylist of DbDataRecord from DataReader
instead?

If reader.HasRows Then
For Each rec As System.Data.Common.DbDataRecord In reader
arr.Add(rec)
Next
End If

I am guessing that arraylist takes up less memory than a DataTable. Am I
correct? Or is there a way for me to check how much memory each takes up
(datatable v.s. Arraylist of DbDataRecord)?

Then, how do I use this technique to construct more than one arraylist when
the datareader returns more than one result set? I can only get the first
resultset and i don't know why. It seems that the reader is closed
automatically when I use this technique.

Another question I have is about DataTable's DataTableReader. What's the
point to use a DataTableReader? Does it make databinding faster than binding
to Datatable itself? I am sure this would not take less memory so why?

Thanks for any advice you can provide!
 
You can't data bind to a DataReader; it only reads one record at a time. It
reads fast, but you have no retainability unless you store the rows
somewhere.

You can convert a datareader into a datatable, but why not just read it as
a DataTable.

Why are you so concerned about the memory taken up by the datatable? Once
it goes out of scope, the GC will know it can be cleared. If it has a
Dispose method, you can invoke it and the GC might dispose of it sooner. I
can't believe an array of the same information would take up a lot less
information. After all a data table is an array of DataRows.

That's my two cents' worth.
Robin S.
 
?? Why can't you databind a datareader ??

And datatable is not just an array of datarows.

And DataTable's Dispose method does nothing at all because it is a managed
object. The method is there simply because it inherits
MarchalByValueComponent.

I am concerned about memory usage because my application is currently being
used on busy websites and I need to make sure everything works smoothly. On a
typical day there are more than 10,000 records being inserted into a table
and the data are constantly queried and databound and there are thousands of
concurrent users. This is why I am much more concerned about memory usage
than anything.
 
Hi there,

Your situation is mostly theoretical one as you shouldn't load tons of rows
anyway.
And letting GC to do its job is just fine, it will take care of it as soon
as the instance goes out of the scope (you might set the variable to null
asap to made it available to GC sooner).
The only thing you might do is to force garbage collection but for doing
that you should really have a good reason.
 
Hi Robin,

Few minor corrections inline
Why are you so concerned about the memory taken up by the datatable? Once
it goes out of scope, the GC will know it can be cleared. If it has a
Dispose method, you can invoke it and the GC might dispose of it sooner.

No, it won't. Dispose doesn't affect the garbage collection timing at all -
at least for the instance Dispose is called on.

I
can't believe an array of the same information would take up a lot less
information. After all a data table is an array of DataRows.

DataTable is a bit more than that. I guess if we want to find the difference
in size we should load same rows in table and into an array, serialize both
and compare its sizes (or compare sizes using a memory profiler)
 
Thanks Miha,

I am the paranoid type :)

And you are correct: I am not loading a ton of rows and there is no serious
data manipulation on databinding, so DataTable should work just fine for me.

I found that binding to ArrayList is faster than DataTable, but there is a
problem about multiple resultset reader as I can't get more than one
Arraylist using the following code:

Do

Dim arr As New ArrayList

If reader.HasRows Then
For Each rec As System.Data.Common.DbDataRecord
In reader
arr.Add(rec)
Next
End If

allHoldingData.Add(arr)

Loop While reader.NextResult

I am unable to use this technique to get more than one ArrayList (for
multiple resultset reader). Is it the limitation of the GetEnumerator method?
 
Samuel said:
Thanks Miha,

I am the paranoid type :)

A healthy dose of paranoia is good :-)
And you are correct: I am not loading a ton of rows and there is no
serious
data manipulation on databinding, so DataTable should work just fine for
me.

I found that binding to ArrayList is faster than DataTable, but there is a
problem about multiple resultset reader as I can't get more than one
Arraylist using the following code:

Do

Dim arr As New ArrayList

If reader.HasRows Then
For Each rec As System.Data.Common.DbDataRecord
In reader
arr.Add(rec)
Next
End If

allHoldingData.Add(arr)

Loop While reader.NextResult

I am unable to use this technique to get more than one ArrayList (for
multiple resultset reader). Is it the limitation of the GetEnumerator
method?

Not sure. Is NextResult always returning false?
And btw, I prefer DataTable over an array because the performance is
marginally different (and most of the time you loose fetching data through
the network) and it is less code.
 
Ah, but in ASP net you can bind to a DataReader--just not in a Windows Forms
application.
Before you go off writing a bunch of code to handle rowsets or multiple
resultsets containing rowsets and populating arrays, consider that MS has
spent considerable time on performance in this area. ADO.NET can populate
multiple arrays for you in a single line of code: Use the DataAdapter Fill
method. It deals with a litany of issues as well--including connection
management (the bane of many an ASP app). You say a DataTable is not an
array? Ah, behind the scenes it's not really much more than an untyped
array. You can also load a DataTable with one line of code: the new Load
method works better than any manual DataTable populator I've seen.

Typically, more performance can be gained in the questions you ask (your
queries) than all of the fancy footwork dealing with the answer (the
rowsets) that's returned.

hth

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
and Hitchhiker's Guide to SQL Server 2005 Compact Edition (EBook)
-----------------------------------------------------------------------------------------------------------------------
 
Samuel said:
?? Why can't you databind a datareader ??

And datatable is not just an array of datarows.

No, it's not *just* an array of data rows. But it can be treated as an
array of data rows.
And DataTable's Dispose method does nothing at all because it is a
managed
object. The method is there simply because it inherits
MarchalByValueComponent.

I am concerned about memory usage because my application is currently
being
used on busy websites and I need to make sure everything works smoothly.
On a
typical day there are more than 10,000 records being inserted into a
table
and the data are constantly queried and databound and there are thousands
of
concurrent users. This is why I am much more concerned about memory usage
than anything.

I didn't realize you were doing a web application, and stand corrected per
the other posts. Good luck with your application.

Robin S.
 
Miha Markic said:
Hi Robin,

Few minor corrections inline


No, it won't. Dispose doesn't affect the garbage collection timing at
all - at least for the instance Dispose is called on.

I didn't say that Disposing of the object would make the GC dispose of it
right away. I said it might help it dispose of it sooner, because it will
know it can be disposed of. So you chide me for saying that, and then you
post this:
And letting GC to do its job is just fine, it will take care of it as soon
as the instance goes out of the scope (you might set the variable to null
asap to made it available to GC sooner).

which seems terribly similar to what I said, except you tell him to set the
variable to null instead of calling the Dispose() method. I don't even know
if a datatable *has* a dispose method. The point was, make sure the GC
knows it *can* be disposed of.
I

DataTable is a bit more than that. I guess if we want to find the
difference in size we should load same rows in table and into an array,
serialize both and compare its sizes (or compare sizes using a memory
profiler)

Yes, that's an excellent idea. And I know the DataTable is more than that,
but you have to admit, it also acts as an array of data rows, in addition
to all the other stuff it does. ;-)

Robin S.
 
William (Bill) Vaughn said:
Ah, but in ASP net you can bind to a DataReader--just not in a Windows
Forms application.

Thanks, Bill, I stand corrected. I didn't know you could do that, because I
don't do ASP (yet).

Robin S.
 
RobinS said:
I didn't say that Disposing of the object would make the GC dispose of it
right away. I said it might help it dispose of it sooner, because it will
know it can be disposed of. So you chide me for saying that, and then you
post this:


which seems terribly similar to what I said, except you tell him to set
the variable to null instead of calling the Dispose() method. I don't even
know if a datatable *has* a dispose method. The point was, make sure the
GC knows it *can* be disposed of.

No, no, GC doesn't care about Dispose method at all. The only thing that
makes a garbage collection faster (and is relevant to garbage collection) is
to eliminate the outer references to the graph - in our case we eliminate
the only reference, that is setting variable to null.
BTW, Dispose is mostly used to release unmanaged resources.
Yes, that's an excellent idea. And I know the DataTable is more than that,
but you have to admit, it also acts as an array of data rows, in addition
to all the other stuff it does. ;-)

Sure, I didn't say that it isn't some sort of an array internally.
 
Miha Markic said:
No, no, GC doesn't care about Dispose method at all. The only thing that
makes a garbage collection faster (and is relevant to garbage collection)
is to eliminate the outer references to the graph - in our case we
eliminate the only reference, that is setting variable to null.
BTW, Dispose is mostly used to release unmanaged resources.

Thanks for clarifying that for me.


Robin S.
 
Hi,

I agree with Miha. DataTable will provide more features for you than
ArrayList.
I am unable to use this technique to get more than one ArrayList (for
multiple resultset reader). Is it the limitation of the GetEnumerator
method?

Did you receive an error message after executing reader.NextResult? I have
tested with your code just now, however it works fine for me so far. Would
you please past some exact error message? So that we can performance more
analyse.

Have a great day,
Sincerely,
Wen Yuan
 
Robin,

In addition to everyone's reply,
1. Yes, you can bind controls to datareader (ASP.NET only),
2. Yes you can read many results sets (see NextResult() method)
3. Have in mind DataReader is used internally in DbDataAdapter.Fill method
for populating the datatable(s) anyway.

Regards
 
Saw this thread and had to reply, I have been dealing with datatables myself lately and have had to optimize some poor code on a very high traffic website. The MSDN seems to disagree with you (see ref: http://msdn.microsoft.com/en-us/magazine/cc163854.aspx) about garbage collection.

You always close your connections and dispose of your objects as soon as you are done with them. You can get away with not doing this for certain apps or websites that are low volume but high capacity will always reveal the flaws in your code. What works for 500 requests a second will not translate to 6500 for example.

I just had a situation where I load tested a page where the developers used datatables all over the place and never called .dispose. At 900 req/sec to a single Server 2008 box many 500 server errors were generated. I added calls to .dispose for every datatable and reran the load test, no more errors. Why? because garbage collection is not to be relied on for high performance code, you can never be sure when your resources will be freed. It's best to dispose of your objects immediately after you're finished with them.

In my case not closing the datatables was using up too many connections in the connection pool and not freeing them for use fast enough. Now I could have raised the maximum # of connections on the pool (which is what was done originally), but as you scale out that becomes a problem because now these connections are being held open to the sql server(not closing fast enough) and with enough traffic will crush sql server, so raising the # of connections shifts the load to sqlserver which is not appropriate.

Using .dispose not only benefits SQL Server but it will help control the resources of the web server in a more efficient manner. Remember best practices people, quick and dirty gets the job done quickly but not correctly.
Hi,

How do I free up the memory taken up by the .NET datatable and dataset after
I finished working with them? How do I HELP the GC to do its job sooner to
release memory taken up by DataTable/DataSet?

I am worried about memory usage. Should I be using DataReader exclusively?
But complex databinding is going to cause problem to me because it is going
to keep the connection open for longer than necessary.

Would it help if I create an arraylist of DbDataRecord from DataReader
instead?

If reader.HasRows Then
For Each rec As System.Data.Common.DbDataRecord In reader
arr.Add(rec)
Next
End If

I am guessing that arraylist takes up less memory than a DataTable. Am I
correct? Or is there a way for me to check how much memory each takes up
(datatable v.s. Arraylist of DbDataRecord)?

Then, how do I use this technique to construct more than one arraylist when
the datareader returns more than one result set? I can only get the first
resultset and i don't know why. It seems that the reader is closed
automatically when I use this technique.

Another question I have is about DataTable's DataTableReader. What's the
point to use a DataTableReader? Does it make databinding faster than binding
to Datatable itself? I am sure this would not take less memory so why?

Thanks for any advice you can provide!
On Friday, March 23, 2007 12:35 AM RobinS wrote:
You can't data bind to a DataReader; it only reads one record at a time. It
reads fast, but you have no retainability unless you store the rows
somewhere.

You can convert a datareader into a datatable, but why not just read it as
a DataTable.

Why are you so concerned about the memory taken up by the datatable? Once
it goes out of scope, the GC will know it can be cleared. If it has a
Dispose method, you can invoke it and the GC might dispose of it sooner. I
can't believe an array of the same information would take up a lot less
information. After all a data table is an array of DataRows.

That's my two cents' worth.
Robin S.
On Friday, March 23, 2007 4:29 AM Miha Markic [MVP C#] wrote:
Hi there,

Your situation is mostly theoretical one as you shouldn't load tons of rows
anyway.
And letting GC to do its job is just fine, it will take care of it as soon
as the instance goes out of the scope (you might set the variable to null
asap to made it available to GC sooner).
The only thing you might do is to force garbage collection but for doing
that you should really have a good reason.

--
Miha Markic [MVP C#, INETA Country Leader for Slovenia]
RightHand .NET consulting & development www.rthand.com
Blog: http://cs.rthand.com/blogs/blog_with_righthand/
news:[email protected]...
On Friday, March 23, 2007 4:33 AM Miha Markic [MVP C#] wrote:
Hi Robin,

Few minor corrections inline


No, it won't. Dispose doesn't affect the garbage collection timing at all -
at least for the instance Dispose is called on.

I

DataTable is a bit more than that. I guess if we want to find the difference
in size we should load same rows in table and into an array, serialize both
and compare its sizes (or compare sizes using a memory profiler)
On Friday, March 23, 2007 6:52 AM Miha Markic [MVP C#] wrote:

A healthy dose of paranoia is good :-)


Not sure. Is NextResult always returning false?
And btw, I prefer DataTable over an array because the performance is
marginally different (and most of the time you loose fetching data through
the network) and it is less code.
On Friday, March 23, 2007 4:28 PM RobinS wrote:
"Miha Markic [MVP C#]" <miha at rthand com> wrote in message

I didn't say that Disposing of the object would make the GC dispose of it
right away. I said it might help it dispose of it sooner, because it will
know it can be disposed of. So you chide me for saying that, and then you
post this:


which seems terribly similar to what I said, except you tell him to set the
variable to null instead of calling the Dispose() method. I don't even know
if a datatable *has* a dispose method. The point was, make sure the GC
knows it *can* be disposed of.


Yes, that's an excellent idea. And I know the DataTable is more than that,
but you have to admit, it also acts as an array of data rows, in addition
to all the other stuff it does. ;-)

Robin S.
On Friday, March 23, 2007 6:05 PM Miha Markic [MVP C#] wrote:

No, no, GC doesn't care about Dispose method at all. The only thing that
makes a garbage collection faster (and is relevant to garbage collection) is
to eliminate the outer references to the graph - in our case we eliminate
the only reference, that is setting variable to null.
BTW, Dispose is mostly used to release unmanaged resources.


Sure, I didn't say that it isn't some sort of an array internally.
On Saturday, March 24, 2007 10:54 PM RobinS wrote:
"Miha Markic [MVP C#]" <miha at rthand com> wrote in message

Thanks for clarifying that for me.


Robin S.
 
Back
Top