IDisposable, using(), RAII and structs [Discussion]

  • Thread starter: codymanix
I don't think so. The initial rationale for introducing
What is your rationale for introducing reference counting then? More
specifically: Why is reference-counted *memory* management better than
garbage collection?

I wasn't talking about memory management; I'm talking about freeing native
resources such as handles. With mark & sweep you can run into problems because
it takes time until the GC kicks in and frees them. But native resource handles
may be very limited, so you can quickly run out of them.

With refcounting, you can guarantee that Dispose() is called immediately
when the refcount reaches zero. Note that disposing an object does not mean
freeing the object itself. The managed object itself can then be freed later by
mark & sweep.
You're not alone with that thinking (I once had similar concerns).
Many people have tried to come up with a better solution but to my
knowledge none has ever come up with something substantial. Have you
followed the link I posted? The guy that wrote that article is an MS
person involved in the design of .NET. He explains how he also once
was a strong proponent of refcounting but became more and more
convinced that a GC-only strategy is better the more he explored the
exact semantics of a twin (GC and refcounting) solution.

I understand that mark & sweep is better for freeing normal memory (when there
is enough of it). But you will run into problems when using mark & sweep with
limited resources. Write a loop that opens network connections. The
mark & sweep system cannot react fast enough: there will be thousands of open
connections and mark & sweep still does nothing. It doesn't even have a
clue that the app will soon run out of handles. With memory, by contrast, the
GC notices when there is not enough left and runs automatically.

I will follow the link you posted; maybe I'll get further insight.
Moreover, isn't it interesting that all other GCed languages (Java,
Python, etc.) work just like .NET does? There is always something
similar to the .NET dispose pattern. Scores of very smart people have
designed those languages and it speaks volumes that they haven't come
up with a better solution either.

Maybe they were very smart, but nobody is perfect. If they had done a perfect
job, we would never need further inventions and improvements. In a hundred
years people would still be using .NET with its mark & sweep because it is the
perfect solution that cannot be improved. Years ago, when assembler was
invented, nobody would have even thought it possible that unused memory
could be freed automatically.
You cannot produce what is classically known as a *memory* leak in a
garbage collected language.

Maybe not a memory leak, but a native resource leak. Additionally, finalizers
are not even guaranteed to run when the program exits.

I wish you a happy Nikolaustag (not sure what it is called in English) :-)
 
Cody,

[snip]
I wasn't talking about memory management; I'm talking about freeing native
resources such as handles.

Ok, you meant resource management.
With mark & sweep you can run into problems because it takes
time until the GC kicks in and frees them. But native resource handles may be
very limited, so you can quickly run out of them.

Resources limited by other factors than memory are just one problem
and not even the worst because the GC can still clean up for you when
you need very few resources.
The following two problems are much worse:
1. Cleanup affecting multiple objects. For example, a StreamWriter
object internally holds its own buffer and a reference to a Stream
object. If you forget to call Dispose on the StreamWriter object and
it is finalized later there is *no way* it could empty the buffer into
the Stream. This is because the Stream also has a finalizer, which
might have been called before the finalizer of the StreamWriter
object. The effect is that the Stream is closed all right but you will
inevitably lose the data in the StreamWriter buffer. Please note that
garbage collection can *never* save you here. The data is always lost!
2. Non-shareable resources such as FileStream. A FileStream object
holds an *exclusive* lock on the file it is currently writing to. If
you fail to call Dispose here and try to reopen the file a bit later
you are almost guaranteed to be greeted with an exception. This is
because the first FileStream object has most probably not yet been
finalized and therefore never had the chance to close the file.
Please note that GC can almost never save you here because it is
unlikely that it will consistently kick in before you try to reopen
the file.
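
Problem 1 can be sketched in a few lines. The following is a minimal C++ model, not .NET code: Sink and BufferedWriter are hypothetical stand-ins for Stream and StreamWriter, and a destructor plays the role of Dispose(). It only illustrates that the buffered data survives when the writer is cleaned up before its stream and is lost when the order is reversed, which is the order a finalizer run is free to choose.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a Stream: accepts writes until it is closed.
class Sink {
public:
    // Returns false (and drops the data) once the sink has been closed.
    bool write(const std::string& data) {
        if (closed_) return false;
        contents_ += data;
        return true;
    }
    void close() { closed_ = true; }
    const std::string& contents() const { return contents_; }
private:
    std::string contents_;
    bool closed_ = false;
};

// Hypothetical stand-in for a StreamWriter: buffers text and passes it on when flushed.
class BufferedWriter {
public:
    explicit BufferedWriter(Sink& sink) : sink_(sink) {}
    ~BufferedWriter() { flush(); }  // deterministic cleanup, playing the role of Dispose()
    void write(const std::string& text) { buffer_ += text; }
    bool flush() {
        bool ok = sink_.write(buffer_);
        buffer_.clear();
        return ok;
    }
private:
    Sink& sink_;
    std::string buffer_;
};

// The writer is cleaned up first, so the buffered text reaches the sink.
std::string writer_cleaned_up_first() {
    Sink sink;
    {
        BufferedWriter writer(sink);
        writer.write("hello");
        assert(sink.contents().empty());  // still sitting in the writer's buffer
    }  // writer flushes here, while the sink is still open
    sink.close();
    return sink.contents();
}

// The sink is closed first, so the writer flushes into a closed sink and the text is lost.
std::string sink_cleaned_up_first() {
    Sink sink;
    {
        BufferedWriter writer(sink);
        writer.write("hello");
        sink.close();
    }  // writer flushes here, too late
    return sink.contents();
}
```

Calling writer_cleaned_up_first() yields "hello", while sink_cleaned_up_first() yields an empty string. Only the code that owns both objects knows which order is the right one.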

Now, let's get back to your original proposal: You wanted to use
refcounting for classes that need deterministic clean-up. If
refcounting fails because there are cycles you wanted to let the GC
take care of it.
I think it should be pretty obvious that ***neither refcounting nor
GC*** is capable of cleaning up correctly when you have objects of
type 1 or 2 forming cycles. The programmer needs to say when to clean
up such objects and that's exactly what Dispose() is for.
Yes, there is a way to fix problem 1 but it is not very convincing as
it forces programmers to avoid certain types of cycles and hurts
finalization performance. Since problem no. 2 cannot be fixed anyway
the .NET designers decided to not trade implementation difficulties
and worse performance for only slightly easier cleanup.
With refcounting, you can guarantee that Dispose() is called immediately
when the refcount reaches zero.

Only when there are no cycles.
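
The cycle caveat is easy to reproduce with any refcounted pointer. Here is a minimal sketch using C++ std::shared_ptr as the refcounting mechanism; Handle and the two function names are made up for illustration, and the destructor stands in for Dispose(). The second function breaks the cycle by hand after measuring, purely so the demo does not leak.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource holder; the counter records how many cleanups have run.
struct Handle {
    explicit Handle(int& released) : released_(released) {}
    ~Handle() { ++released_; }       // stands in for Dispose()
    std::shared_ptr<Handle> peer;    // strong (counted) reference
    int& released_;
};

// No cycle: both refcounts reach zero at the end of the inner scope.
int released_without_cycle() {
    int released = 0;
    {
        auto a = std::make_shared<Handle>(released);
        auto b = std::make_shared<Handle>(released);
        a->peer = b;                 // a -> b only
    }
    return released;
}

// Cycle: a and b keep each other alive, so no cleanup runs when the scope ends.
int released_with_cycle() {
    int released = 0;
    std::weak_ptr<Handle> observer;  // lets us reach the object again without counting
    {
        auto a = std::make_shared<Handle>(released);
        auto b = std::make_shared<Handle>(released);
        a->peer = b;
        b->peer = a;                 // a -> b -> a
        observer = a;
    }
    assert(!observer.expired());     // the cycle kept both objects alive
    int released_after_scope = released;

    // Break the cycle explicitly; only now do both destructors run.
    if (auto a = observer.lock()) a->peer.reset();
    return released_after_scope;
}
```

released_without_cycle() reports 2 cleanups, released_with_cycle() reports 0 at the point where the last external reference went away.
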
Note that disposing an object does not mean
freeing the object itself. The managed object itself can then be freed later by
mark & sweep.

I am very aware of that.
I understand that mark & sweep is better for freeing normal memory (when there
is enough of it). But you will run into problems when using mark & sweep with
limited resources. Write a loop that opens network connections. The
mark & sweep system cannot react fast enough: there will be thousands of open
connections and mark & sweep still does nothing. It doesn't even have a
clue that the app will soon run out of handles. With memory, by contrast, the
GC notices when there is not enough left and runs automatically.

Again, I'm very aware of that. However, what you describe here is not *the*
reason why the Dispose pattern is necessary. For example, you could
imagine a system that has its own GC for every shareable resource type
that is limited by factors other than memory.

[snip]
Maybe they were very smart, but nobody is perfect. If they had done a perfect
job, we would never need further inventions and improvements. In a hundred
years people would still be using .NET with its mark & sweep because it is the
perfect solution that cannot be improved. Years ago, when assembler was
invented, nobody would have even thought it possible that unused memory
could be freed automatically.

As I tried to explain above, the problem is *systemic* and has nothing
to do with language designers not being smart enough or a GC not being
perfect or technology not being available. There simply is no
practical way that certain types of cleanup could be performed
automatically *and* correctly.
In theory fully automatic cleanup is possible though: You simply let
the GC run whenever a reference goes out of scope. However, it should
be clear that doing so wastes ridiculous amounts of precious cycles on
something humans are so much better at.
Additionally, finalizers are
not even guaranteed to run when the program exits.

This is only true if your program does funny or very lengthy things
during finalization (e.g. creating objects or writing s..tloads of
data to the disk). The finalizer thread will call all finalizers as
long as the number of finalizable objects does not increase and the
whole finalization does not take longer than somewhere around 40
seconds (I forgot the actual number).
I wish you a happy Nikolaustag (not sure what it is called in English) :-)

Thanks, I wish you the same. I know I'm late ;-). BTW, I'm actually
Swiss, so there's no need to translate...

Regards,

Andreas
 
Andreas, (without thoroughly examining the idea), wouldn't it be fine to
A) create a refcounted type
B) refcounted type may NOT directly or indirectly refer to another
refcounted type, verifiable via compiler and runtime (when using generic
object references, or not allow reftypes to hold object references at all...)

I believe that could work... and prevents the circular reference that
plagued rich COM object models...

(And BTW, how did MS ever prevent memory leaks in the MSXML parsing object
model??? I just gotta know!)


--
Eric Newton
C#/ASP Application Developer
http://ensoft-software.com/

 
Eric,
Andreas, (without thoroughly examining the idea), wouldn't it be fine to
A) create a refcounted type
B) refcounted type may NOT directly or indirectly refer to another
refcounted type, verifiable via compiler and runtime (when using generic
object references, or not allow reftypes to hold object references at all...)

I believe that could work... and prevents the circular reference that
plagued rich COM object models...

Yes it would work fine, given that such a type couldn't contain
references to GCed objects and GCed objects couldn't contain
references to such refcounted types. However, how useful would such a
type be?
(And BTW, how did MS ever prevent memory leaks in the MSXML parsing object
model??? I just gotta know!)

Sorry, I don't understand. What property of the MSXML lib are you
referring to?

Regards,

Andreas

P.S. I won't be able to answer further posts before Jan 5th...
 
comments inline:

Andreas Huber said:
Sorry, I don't understand. What property of the MSXML lib are you
referring to?

Well, I was actually referring to the whole library. Since the MSXML nodes
all point to a DOMDocument and the DOMDocument points to all the nodes
through a collection (a circular reference), how on earth did this not leak
memory when you set the DOMDocument instance to Nothing without some kind of
Release or Dispose or whatever...?

And by the way, Chris B on the CLR team has introduced the HandleCollector,
basically a reference-counting mechanism that will collect on a new
instantiation of a finite resource, instead of on release... Fascinating how,
if you look at a problem from a different perspective, you can achieve the
same result (in a sense...)
 
Eric,
Well, I was actually referring to the whole library. Since the MSXML
nodes all point to a DOMDocument and the DOMDocument points to all
the nodes through a collection (a circular reference), how on earth
did this not leak memory when you set the DOMDocument instance to Nothing
without some kind of Release or Dispose or whatever...?

I don't know the library at all. The main strategy to prevent
circular-reference-memory-leaks is to use weak pointers:

http://www.boost.org/libs/smart_ptr/weak_ptr.htm

So, while a DOMDocument would hold ordinary refcounted pointers to its
nodes, the nodes themselves would internally hold only weak pointers to
their DOMDocument. I know that COM does not have the weak pointer concept
but you can just as well use plain C++ pointers instead (i.e. you don't call
AddRef). The node member function returning the DOMDocument pointer simply
calls AddRef on the DOMDocument object...
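
The weak back-pointer arrangement described above might look roughly like this in C++ with std::shared_ptr and std::weak_ptr. Document and Node here are hypothetical types invented for the sketch and have nothing to do with the real MSXML classes; owner() models the member function that hands out a counted pointer on request.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node;

// Hypothetical document: it owns its nodes through counted pointers.
struct Document {
    explicit Document(int& destroyed) : destroyed_(destroyed) {}
    ~Document() { ++destroyed_; }
    std::vector<std::shared_ptr<Node>> nodes;
private:
    int& destroyed_;
};

// Hypothetical node: it refers back to its document without owning it.
struct Node {
    explicit Node(const std::shared_ptr<Document>& doc) : owner_(doc) {}
    // Hands out a counted pointer on request, comparable to calling AddRef
    // before returning. Yields an empty pointer once the document is gone.
    std::shared_ptr<Document> owner() const { return owner_.lock(); }
private:
    std::weak_ptr<Document> owner_;  // weak: does not keep the document alive
};

// Returns how many documents were destroyed once the last external reference was dropped.
int destroyed_after_release() {
    int destroyed = 0;
    std::shared_ptr<Node> survivor;
    {
        auto doc = std::make_shared<Document>(destroyed);
        doc->nodes.push_back(std::make_shared<Node>(doc));
        survivor = doc->nodes.front();
        assert(survivor->owner() == doc);  // the back-pointer works while the document lives
    }  // the last counted reference to the document goes away here
    assert(survivor->owner() == nullptr);  // the node can tell that its document is gone
    return destroyed;
}
```

Because the nodes never contribute to the document's refcount, the document is destroyed as soon as the last outside reference goes away, even though document and nodes still point at each other.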

[snip]
And by the way, Chris B on the CLR team has introduced the
HandleCollector, basically a reference-counting mechanism that will
collect on a new instantiation of a finite resource, instead of on
release... Fascinating how, if you look at a problem from a different
perspective, you can achieve the same result (in a sense...)

I guess I fail to get your point. How would such a HandleCollector be able
to provide *deterministic* clean-up? In other words, HandleCollector solves
the resources-limited-by-other-factors-than-memory problem but it does not
solve the other clean-up problems that I described in an earlier post...
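
To make the distinction concrete, here is a rough C++ model of a collector that is driven by the number of live handles. It is an assumption-laden sketch of the idea only, not the .NET HandleCollector API: HandleCounter, its threshold rule and the collect callback are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical model of a handle-count-driven collector (not the .NET API).
class HandleCounter {
public:
    HandleCounter(std::size_t threshold, std::function<void()> collect)
        : threshold_(threshold), collect_(std::move(collect)) {}

    // Called when a handle is acquired; requests a collection once the
    // number of live handles exceeds the threshold.
    void add() {
        if (++count_ > threshold_) collect_();
    }
    // Called when a handle is released.
    void remove() {
        assert(count_ > 0);
        --count_;
    }
    std::size_t count() const { return count_; }
private:
    std::size_t threshold_;
    std::function<void()> collect_;
    std::size_t count_ = 0;
};

// Acquires `handles` handles without releasing any of them and returns
// how many collections were requested along the way.
int collections_requested(std::size_t threshold, std::size_t handles) {
    int collections = 0;
    HandleCounter counter(threshold, [&collections] { ++collections; });
    for (std::size_t i = 0; i < handles; ++i) counter.add();
    return collections;
}
```

With a threshold of 3, acquiring 3 handles requests no collection and acquiring 5 requests two. Cleanup is tied to the next acquisition rather than to the point where a particular object is no longer needed, so the ordering and exclusive-lock problems remain.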

Regards,

Andreas
 