Garbage Collection with Weak References

Derrick

I'm loading a boatload of data into a DataSet. The app's memory usage grows
and grows while the data loads, and calling GC.Collect() reduces it only
slightly. When I minimize the app, though, the usage drops to about 500 KB,
then grows again when I maximize the app and work with the DataSet. The
DataSet still appears to hold all the data, even though the footprint was
many, many megabytes by the end of the load.

Two questions:
1) I'm guessing DataSet employs weak references to all data, and that weak
referenced data is GC'ed when the app minimizes. Can anyone confirm/deny
that?
2) If that is the case, is there a "GC.AggressivelyCollect()" type call I
can make while I'm loading data so the footprint stays in a somewhat sane
range?

Thanks in advance!

Derrick
 
1) I'm guessing DataSet employs weak references to all data, and that weak
referenced data is GC'ed when the app minimizes. Can anyone confirm/deny
that?

This is not the case. If the DataSet held only weak references and they were
collected, how would the DataSet reacquire the data?
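
To make that concrete, here is a minimal sketch of how a weak reference
behaves. The class and method names are made up for illustration, and the
exact result depends on the runtime and build settings, so treat the second
line of output as typical rather than guaranteed.

```csharp
using System;
using System.Runtime.CompilerServices;

static class WeakReferenceDemo
{
    // Allocate in a separate, non-inlined method so that no strong
    // reference to the target lingers in the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateWeak()
    {
        return new WeakReference(new object());
    }

    static void Main()
    {
        WeakReference weak = CreateWeak();
        Console.WriteLine("Alive before collect: " + weak.IsAlive);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Once the target has been collected, Target returns null and
        // there is no way to get the original object back.
        Console.WriteLine("Alive after collect:  " + weak.IsAlive);
    }
}
```

If a DataSet stored its rows this way, the rows would simply be gone after a
collection, which is not what you are seeing.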
 
Unless you are running out of memory, I would advise that you let the GC do
its job uninterrupted and without intervention. The application is going
to use as much memory as it needs, but the GC will automatically start
collecting more aggressively if system memory comes under pressure.

Oh, one other thing: manually calling GC.Collect() might have some
unintended consequences for you. Depending on what's going on in your app
at the time you call it, it might actually cause memory to be released less
efficiently. The GC operates on the principle that longer-lived objects are
collected less frequently than short-lived objects, so each time an object
survives a GC (because there is still a reference to it), it is promoted to
a higher generation. There are 3 generations: 0, 1, and 2. The GC won't try
to collect an object in generation 1 unless a gen 0 collection did not
release enough memory. Gen 2 collections are rarer still, and are very
expensive, as the GC must freeze all threads and examine every object in
memory. By calling GC.Collect() manually, you might be artificially
promoting some of your objects to gen 1 or even gen 2, thereby increasing
the amount of time that the memory will be held (and increasing the expense
of destroying them).
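
The promotion effect can be observed with a small sketch like the one below.
GC.GetGeneration, GC.Collect and GC.MaxGeneration are real framework calls;
the class and helper names are hypothetical. On a typical desktop runtime the
object starts in gen 0 and moves up one generation per forced collection, but
that is an observation, not a guarantee.

```csharp
using System;

static class GenerationDemo
{
    // Records the generation of 'target' before any forced collection
    // and again after each one.
    public static int[] TrackGenerations(object target, int collections)
    {
        int[] generations = new int[collections + 1];
        generations[0] = GC.GetGeneration(target);
        for (int i = 1; i <= collections; i++)
        {
            GC.Collect();
            generations[i] = GC.GetGeneration(target);
        }
        GC.KeepAlive(target); // keep the object reachable until we are done
        return generations;
    }

    static void Main()
    {
        int[] generations = TrackGenerations(new object(), 2);
        Console.WriteLine("Highest generation: " + GC.MaxGeneration);
        for (int i = 0; i < generations.Length; i++)
            Console.WriteLine("After " + i + " forced collection(s): gen " + generations[i]);
    }
}
```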

Rico Mariani (performance optimizer guy for MS) has an interesting blog
entry on the topic:
http://weblogs.asp.net/ricom/archive/2003/12/02/40780.aspx
 
What's going on if it's not garbage collection, then? Can you explain the
memory usage behavior I'm observing? It does not make sense to me.

Looking at the app (release build, simple Windows Forms app) in Task Manager:
App startup: mem usage 10 MB
Load a 20 MB XML file into a DataSet: mem usage about 60 MB
Minimize the app: mem usage 500 KB
Maximize the app: mem usage 2 MB
Navigate data via the forms app: all data still available in the DataSet,
mem usage goes up slowly
Minimize the app: mem usage back down to 500 KB
Maximize the app: mem usage 2 MB

In a previous post I provided sample code to reproduce this; let me know if
you'd like me to post it again.

Thanks -

Derrick
 
Thanks for the info. I had been calling GC.Collect(0) after reading a
little today, but it still didn't help.

I still don't understand why the mem usage stat in Windows Task Manager
drops to almost nothing when the app is minimized, and then grows relatively
slowly upon maximizing and navigating data. Is that expected?

I'm not running out of memory on my machine, but I am worried about clients
running on older machines with much less memory available. The dataset I'm
prototyping with is one of our smaller sets, an XML file of about 20 MB.
They can get up to 300 MB.
 
Derrick said:
Thanks for the info, I had been calling GC.Collect with 0 after reading a
little today, still didn't help.

I still don't understand why the mem usage stat in windows task manager
drops to almost nothing when the app is minimized, and then grows relatively
slowly upon maximizing and navigating data. Is that expected?

I'm not running out of memory on my machine, but am worried about clients
running on older machines with much less memory available. The dataset I'm
prototyping with is one of our smaller sets, xml file about 20m. They can
get up to 300m....

The OS trims the working set (WS) of every Windows program when it is
minimized; this has nothing to do with .NET.
When memory becomes scarce, the OS will also trim the WS of all processes.
From the application's perspective, there is no need to call GC.Collect, and
reducing the WS of an active process is also a bad idea, as it often results
in a page-out sequence followed by a page-in; in short, a lot of
unnecessary I/O.

Now back to your XML file: if you really intend to process such large XML
files in a timely manner (that is, without paging), you will have to change
your design, or you will need a lot of memory (>1 GB). Neither GC.Collect
nor reducing the WS will help you out.

Willy.
 
Well, it could be that the initial parse of the XML is costing you temporary
memory. As Willy noted, the O/S will reclaim memory on a minimize.

Are these XML files just serialized DataSets, or are they XML files that you
have created and are "manually" reading into DataSets? If it's the latter,
you might try using a different technology to read the XML files. If you
load a file as a DOM object (at least, this is the way the COM XML parsers
work; someone correct me if .NET is different), the whole document must be
loaded into memory and parsed, so large XML files can be very expensive.
You might try an alternative technology, like SAX, which loads the XML
progressively and fires events as new elements are read.
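
As a rough sketch of the progressive approach: the .NET Framework does not
ship a SAX parser, so this example swaps in XmlReader, which is forward-only
and pull-based (you ask for the next node instead of receiving events) but
likewise avoids holding the whole document in memory. The class name, the
"row" element and the sample XML are invented for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class StreamingXmlDemo
{
    // Counts elements with the given name while reading forward-only,
    // without ever building a DOM for the document.
    public static int CountElements(TextReader input, string elementName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                    count++;
            }
        }
        return count;
    }

    static void Main()
    {
        string xml = "<rows><row id=\"1\"/><row id=\"2\"/><row id=\"3\"/></rows>";
        Console.WriteLine(CountElements(new StringReader(xml), "row")); // prints 3
    }
}
```

For a real file you would pass a StreamReader over the file instead of the
StringReader, and process each element as it arrives rather than counting.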
 
What you are observing is not the amount of managed heap in use by your
application. The figure reported by Task Manager is essentially your
process's working set, which is a very coarse measurement. When a windowed
application is minimized, the operating system will collapse the working
set - that's the behavior you're seeing.

If you really want to understand what's going on with your memory
allocations, use a tool like the CLR profiler available at:
www.gotdotnet.com
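
For a quick check without a profiler, the two numbers can be printed side by
side. This is only a sketch: Process.WorkingSet64 and GC.GetTotalMemory are
real framework members, while the class name and the 64 MB test allocation
are arbitrary choices for illustration.

```csharp
using System;
using System.Diagnostics;

static class MemoryReport
{
    // Returns the OS-level working set next to the GC's own estimate
    // of the bytes currently allocated on the managed heap.
    public static string Describe()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            long workingSetBytes = self.WorkingSet64;
            long managedHeapBytes = GC.GetTotalMemory(false);
            return string.Format("Working set: {0:N0} KB, managed heap: {1:N0} KB",
                workingSetBytes / 1024, managedHeapBytes / 1024);
        }
    }

    static void Main()
    {
        Console.WriteLine("At startup:       " + Describe());

        // Hold on to some managed memory so the second report differs.
        byte[][] chunks = new byte[64][];
        for (int i = 0; i < chunks.Length; i++)
            chunks[i] = new byte[1024 * 1024];

        Console.WriteLine("After allocating: " + Describe());
        GC.KeepAlive(chunks);
    }
}
```

Calling Describe() before and after minimizing the window should show the
working set collapsing while the managed heap figure stays roughly the same.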
 