Avoiding Default Parameter Checking in C# 4.0

  • Thread starter: jehugaleahsa
Hello:

I am really excited about the default parameters in C# 4.0. However, I
am wondering if there will be a mechanism for avoiding checks against
"default" values. Since I am providing the default, I know it will
definitely work. Is there a way to skip over any argument checks on
that parameter?

Thanks,
Travis
 
I am really excited about the default parameters in C# 4.0.

Unless you are using Office or COM interop, I'm not sure you should be.
However, I
am wondering if there will be a mechanism for avoiding checks against
"default" values. Since I am providing the default, I know it will
definitely work. Is there a way to skip over any argument checks on
that parameter?

Sure: the old fashioned way.

AFAIK, there's no implicit support for detecting default-using call
sites of your method. After the code's been compiled, it looks just
like a call where the caller provided the default value instead.

But before C# 4.0, the usual approach to default parameters was to
provide overloads: an overload that omits some parameters forwards to a
fuller overload, supplying the default values itself. If you continue
to handle default arguments that way, then your fuller overload (or a
shared private implementation) can be told, via its arguments, which
values were caller-supplied, so that checking of arguments that had not
been provided by the caller can be omitted.
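The overload approach Pete describes might look like the following sketch (the `PrintRepeated` names are made up for illustration, not from the thread): the short overload supplies a default it knows is valid and skips straight to an unchecked worker, while the full overload validates everything the caller provided.

```csharp
using System;

static class Printer
{
    // Caller omitted 'count': we supply a default we know is valid,
    // so no range check is needed before forwarding.
    public static void PrintRepeated(string text)
    {
        if (text == null) throw new ArgumentNullException("text");
        PrintRepeatedUnchecked(text, 1);   // default known-good
    }

    // Caller supplied 'count': validate it before forwarding.
    public static void PrintRepeated(string text, int count)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        PrintRepeatedUnchecked(text, count);
    }

    // Shared implementation; assumes its arguments are already valid.
    private static void PrintRepeatedUnchecked(string text, int count)
    {
        for (int i = 0; i < count; i++)
            Console.WriteLine(text);
    }
}
```

The private worker is what keeps the checks from being repeated: each public overload validates only what its own caller could have gotten wrong.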

That said, your argument validation would have to be pretty expensive
before it could be justified to bother skipping it. Simply checking to
see which arguments were actually provided is overhead in and of
itself, and would often be similar in cost to the validation itself
(e.g. range-checking, checking for null arguments, etc.), not to
mention the extra maintenance overhead in programmer time of the
additional logic.

It seems to me that in 99.94% of the code, you are better off just doing
validation on all the arguments, regardless of whether you can and/or do
know that a default value was used. And in that last 0.06%, there's
probably something broken about the validation that is causing the
analysis to come out the other way. :)

Pete
 

Thanks, Pete. I figured as much, but thought I would ask anyway. I am
really looking forward to default parameters for use in my open source
project Compass. Right now, in 3.5, I spend close to an hour just
defining all of my overloads. One of my policies has been to never
call one public method from another. However, I am planning on using
the default parameters provided by 4.0, so I might just start having
one method call another. It would definitely cut down on clutter
and duplicate argument checking. I'm so glad it's just an
implementation detail.
 

I had another question: can one default argument depend on another?

Say I have a method that looks like this:

int Maximum<T>(IList<T> list, int index = 0, int count = list.Count,
Func<T, T, int> comparison = Comparer<T>.Default.Compare);

Will I be allowed to say "int count = list.Count"? Just curious. Is
there a really good site to find answers to more questions I might
have?
 
Thanks, Pete. I figured as much, but thought I would ask anyway. I am
really looking forward to default parameters for use in my open source
project Compass. Right now, in 3.5, I spend close to an hour just
defining all of my overloads.

Just keep in mind that there's a significant difference between
overloads and default parameters: the default parameters get determined
when the _client_ code is compiled, whereas default values provided by
overloads get determined when the library code is compiled.

This is either a good thing or a bad thing, depending on circumstances,
but it's critical you keep it in mind. If you use default parameters
and later decide the default needs to be something else, already-compiled
clients won't pick up the change. On the other hand, if you use
overloads, changing the default later is possible, but making that
change introduces all sorts of other potential maintenance "gotchas".

But the worst situation to be in is to change the default using default
parameters. Then, you wind up with old code doing one thing, and new
code doing a different thing, depending on what version of your own
library it was compiled against.
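The versioning difference Pete describes is easy to picture side by side (a sketch with made-up names; `ConnectTimeoutA`/`ConnectTimeoutB` aren't from the thread):

```csharp
static class Library
{
    // Optional parameter: the constant 30 is baked into each CALLER's
    // compiled code. If the library later changes it, clients compiled
    // against the old version keep passing 30 until they recompile.
    public static int ConnectTimeoutA(int seconds = 30)
    {
        return seconds;
    }

    // Overload pair: the default lives here, inside the library, so a
    // later change takes effect for every client without recompiling them.
    public static int ConnectTimeoutB()
    {
        return ConnectTimeoutB(30);
    }

    public static int ConnectTimeoutB(int seconds)
    {
        return seconds;
    }
}
```

Both shapes behave identically today; they only diverge when the default value needs to change after clients have already shipped.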
One of my policies has been to never
call one public method from another.

All due respect, I find that to be a silly policy. I can see no real
value in it, and there are lots of times it makes sense for one public
method to call another.
However, I am planning on using
the default parameters provided by 4.0, so I might just start having
one method call another. It would definitely cut down on clutter
and duplicate argument checking. I'm so glad it's just an
implementation detail.

I'm not sure I understand. Wouldn't the point of using default
parameters be so that you _don't_ have to have one method overload call
another? If it's your policy to not have a public method call another
public method, and you are planning to use default parameters, why would
you start violating your policy now, just when the language is about to
offer you a way out?

I must be missing something here, but I don't know what it is.

Pete
 
I had another question: can one default argument depend on another?

Say I have a method that looks like this:

int Maximum<T>(IList<T> list, int index = 0, int count = list.Count,
Func<T, T, int> comparison = Comparer<T>.Default.Compare);

Will I be allowed to say "int count = list.Count"? Just curious. Is
there a really good site to find answers to more questions I might
have?

I don't know about a good reference. Sam Ng, on the language team, has
a blog where he's covered some of the default parameter stuff. I
suppose that might be a good starting point:
http://blogs.msdn.com/samng/default.aspx

As far as your specific question goes, no...the default has to be
determined at compile time. That's the point of my previous comments
about how using default parameters locks your clients into a particular
default when they compile their code.
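Since defaults must be compile-time constants, the usual workaround for Travis's `Maximum` signature is to give the dependent parameters constant sentinel defaults and resolve them inside the method. A sketch (the `-1`/`null` sentinel conventions are illustrative choices, not from the thread):

```csharp
using System;
using System.Collections.Generic;

static class Sequences
{
    // 'count' can't default to list.Count, and 'comparison' can't
    // default to Comparer<T>.Default.Compare: neither is a compile-time
    // constant. Constant sentinels stand in for them instead.
    public static T Maximum<T>(IList<T> list, int index = 0,
        int count = -1, Func<T, T, int> comparison = null)
    {
        if (list == null) throw new ArgumentNullException("list");
        if (count < 0) count = list.Count - index;   // -1 means "to the end"
        if (comparison == null) comparison = Comparer<T>.Default.Compare;

        T max = list[index];
        for (int i = index + 1; i < index + count; i++)
        {
            if (comparison(list[i], max) > 0)
                max = list[i];
        }
        return max;
    }
}
```

The cost is that `-1` and `null` become reserved "not provided" values, which is exactly the kind of in-method defaulting the overload style would otherwise express.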

In case you weren't aware, VS2010 is in open beta now. You can download
it from Microsoft and check out the new C# 4.0 features first-hand, if
you're curious. It should be able to run side-by-side with VS2008, if I
recall correctly.

Pete
 
One of my policies has been to never call one public method from another.

Out of curiosity, why not? If everything else can call one of your
class's methods, what benefit comes from forbidding your own class's
other methods the same access?
 
Out of curiosity, why not? If everything else can call one of your
class's methods, what benefit comes from forbidding your own class's
other methods the same access?

I can call a public method in one class from another. The policy is/
was specific to in-class method calls. I typically reserve public
methods for argument checking, exception handling and resource
management. Then they just call a private method. The private method
stays really pure and simple, which makes it extremely readable.
Occasionally an exception needs to be thrown in the middle of the
private implementation, but this is a rare case.

The original reason, in the case of my Compass library, was that I
didn't want to hurt performance by performing unnecessary argument
checks. For instance, I have a ToIList method which converts an
IEnumerable<T> to an IList<T> (using "as" to see whether it is
already one first). I do a check to see whether the IEnumerable<T> is
null and whether it is an instance of IList<T>. If I am calling it
from a method where I know for sure that the IEnumerable<T> is not
null and that it is not IList<T>, why pay for the runtime overhead?

There are also occasions where I want a method to be available for
both .NET 1 and beyond. So, if I am working with an IEnumerable<T> and
an IEnumerable, I will create a private method:

int count(IEnumerator enumerator)
{
    int count = 0;
    while (enumerator.MoveNext())
    {
        ++count;
    }
    return count;
}

I call this method from a generic and a non-generic version, like so:

int Count<T>(IEnumerable<T> collection)
{
    // argument check
    using (IEnumerator<T> enumerator = collection.GetEnumerator())
    {
        return count(enumerator);
    }
}

int Count(IEnumerable collection)
{
    // argument check
    return count(collection.GetEnumerator());
}

Since IEnumerator<T> is disposable, I manage the resource in the
public method. The private method doesn't care and remains simple and
straightforward.

A lot of it, though, was just my C++ super-performance-intensive
mindset coming back. However, this approach works pretty well in
full-blown applications, too. Mostly it just helps me separate
the business logic from the "programming" details.
 
I can call a public method in one class from another. The policy is/
was specific to in-class method calls.

I think we all understood that much.
I typically reserve public
methods for argument checking, exception handling and resource
management.

So, none of your in-class code gets these benefits? Seems like that
would only increase overhead elsewhere, either in maintenance or
duplicated code.
Then they just call a private method. The private method
stays really pure and simple, which makes it extremely readable.
Occasionally an exception needs to be thrown in the middle of the
private implementation, but this is a rare case.

The original reason, in the case of my Compass library, was that I
didn't want to hurt performance by performing unnecessary argument
checks.

I find this comment ironic. It seems to me that the rule to never call
public methods from within the class leads to extra method calls, itself
a potential "performance hurting" operation. Performance-wise, it seems
that if one has to choose between checking for a null pointer and making
two method calls where one would suffice, the former would be
preferable. (Granted, that may not be the specific pair of choices one
is presented with...the example is just for the point of comparison).
For instance, I have a ToIList method which converts an
IEnumerable<T> to an IList<T> (using "as" to see whether it is
already one first). I do a check to see whether the IEnumerable<T> is
null and whether it is an instance of IList<T>.

I'm confused by your description of this method. It sounds like you
have a method that does basically what the "as" operator does, except
that it somehow fails (throws an exception?) if you pass it "null"? And
presumably it also fails somehow if the object doesn't implement
IList<T> (also an exception?).

What is this method doing, other than creating a situation where you
could have a variable typed as IList<T> and fail to convert the value of
that variable to an IList<T>? For example:

   IList<T> ToIList<T>(IEnumerable<T> e)
   {
     IList<T> l = e as IList<T>;

     if (e == null || l == null)
       throw new ArgumentException();

     return l;
   }

   void MethodA()
   {
     MethodB<int>(null);
   }

   void MethodB<T>(IList<T> l)
   {
     MethodC(l);
   }

   void MethodC<T>(IEnumerable<T> e)
   {
     IList<T> l = ToIList(e);
   }

The example is only slightly contrived, in that I've explicitly created
a sequence of calls guaranteed to fail. Real-life code could
legitimately have a null reference in a variable rather than a
hard-coded value.

In any case...
If I am calling it
from a method where I know for sure that the IEnumerable<T> is not
null and that it is not IList<T>, why pay for the runtime overhead?

If you already know it's NOT an IList<T>, why would you call such a
method at all, regardless of whether it has argument checking or not?

In any case, the more direct answer to your question is: because it
complicates the code, and creates runtime overhead for other scenarios
(an extra method call for any callers of the public, argument-checking
version).

Basically, your in-class code avoids a little bit of overhead, but it
forces your client code to suffer _two_ forms of overhead instead of
just one.
There are also occasions where I want a method to be available for
both .NET 1 and beyond. So, if I am working with an IEnumerable<T> and
an IEnumerable, I will create a private method:

int count(IEnumerator enumerator)
{
    int count = 0;
    while (enumerator.MoveNext())
    {
        ++count;
    }
    return count;
}

I call this method from a generic and a non-generic version, like so:

int Count<T>(IEnumerable<T> collection)
{
    // argument check
    using (IEnumerator<T> enumerator = collection.GetEnumerator())
    {
        return count(enumerator);
    }
}

int Count(IEnumerable collection)
{
    // argument check
    return count(collection.GetEnumerator());
}

First, there's already an IEnumerable<T>.Count() extension method. Why
write your own? (Especially since the extension method is clever enough
to not actually enumerate the entire collection if there's a Count
property it can retrieve).

Second, IEnumerable<T> implements IEnumerable. So you don't need a
third method for the purpose. You can just have your Count<T>() method
call your Count() method and put the implementation there:

   int Count<T>(IEnumerable<T> collection)
   {
     return Count((IEnumerable)collection);
   }

   int Count(IEnumerable collection)
   {
     int iRet = 0;

     foreach (object obj in collection)
     {
       iRet++;
     }

     return iRet;
   }

Note that there's also no need to get the enumerator itself. Let the
language take care of that for you with the "foreach" statement.

In any case, I don't see how that example explains a policy of never
calling a class's public methods from within the class.

Pete
 

The count method was just an example. As for the ToIList method,
it acts a lot like the ToList method. If it is not an IList, it simply
creates one from the argument.

Also, it is better to enumerate the IEnumerator instead of using a
"foreach" because there is no need to inspect the values of the
collection. If you were working on a value type, such as DateTime or
Decimal, you would be copying a lot of data unnecessarily. It's just a
minor performance enhancement. You also have to wonder whether C# uses
a using block when looping over a non-generic IEnumerable. It is
guaranteed for IEnumerable<T>, but how could it know prior to runtime
whether to do resource management or not? My guess is that it forgoes
the using, which is not the expected behavior (although I have
only seen one example where IEnumerator<T> actually needed to be
IDisposable).

The overhead of calling private methods isn't that significant. You're
probably right that this costs more than a couple argument checks
(except in the case of reflection). It is not always justifiable to
create a whole new method just to hold your implementation details. I
rarely have classes that call their own methods. However, when I do, I
usually extract the logic to avoid unnecessary argument checking, etc.
I have found that if I am extremely vigilant, some methods can become
static, instead of instance methods. If enough of these appear in the
class, and they are related, I usually create a "helper" class. It is
a great way to discover unexpected code reuse.

Another thing to remember is that the logic within a method can be as
simple as an if/else statement or extremely complex. The more time
consuming the method, the less and less significant a little method
call overhead becomes. When calling between members, the call is going
to take place no matter what; it is just a matter of which methods
call which.

In my experience, calling public methods from others has caused me
problems. I lose track of which methods do what. I know this may sound
horrible, but my public interfaces change quite often during
development. Refactoring and all... Strangely, my private
implementations rarely change, just how and when they are called. This
seems a bit backward, but it's the truth. I mean, well factored logic
isn't going to change often; just how and when it is called.

Nothing in our profession should always be done one way; there's
always a special case. For this policy, there are probably about a
hundred. I take a lot of care when breaking apart my code, so I'm not
100% sold on the idea either. Personally, though, it has really paid
off and seems to scale well.
 
The count method was just an example. As for the ToIList method,
it acts a lot like the ToList method. If it is not an IList, it simply
creates one from the argument.

That doesn't change the point I made. Specifically: a) there's no
obvious reason a null input should be an error, and b) nothing in your
explanation of the ToIList() method suggests a need to factor it into
two pieces, one for argument checking and one for implementation details.
Also, it is better to enumerate the IEnumerator instead of using a
"foreach" because there is no need to inspect the values of the
collection.

The enumerator copies them somewhere, whether you look at them or not.
And the JITter can easily refer directly to that "somewhere" (i.e. the
IEnumerator.Current property) rather than actually copying the value
again to a new variable.
If you were working on a value type, such as DateTime or
Decimal, you would be copying a lot of data unnecessarily.

How do you know that? Have you actually proven that in production code,
using IEnumerator directly to count items in an enumeration produces a
significant performance improvement over the simpler, more maintainable
approach?
It's just a minor performance enhancement.

I remain unconvinced that it's a noticeable performance enhancement at
all. I'll bet that even in a trivial benchmark that only enumerates,
there'd be no practical difference between the two, and certainly in
code that does something interesting with the enumeration, any
difference that may exist (if any) would be inconsequential.
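Pete's bet is straightforward to check. A minimal harness sketch (not from the thread) that counts a `List<decimal>` both ways, ready to wrap in a `Stopwatch` loop:

```csharp
using System;
using System.Collections.Generic;

static class CountBench
{
    // Count with foreach: the compiler reads Current into a local
    // ('d') on every iteration, even though it is never used.
    public static int CountForeach(List<decimal> items)
    {
        int n = 0;
        foreach (decimal d in items)
            n++;
        return n;
    }

    // Count with the enumerator directly: Current is never read,
    // so the 16-byte decimal is never copied out.
    public static int CountEnumerator(List<decimal> items)
    {
        int n = 0;
        using (IEnumerator<decimal> e = items.GetEnumerator())
        {
            while (e.MoveNext())
                n++;
        }
        return n;
    }
}
```

Timing each method over a large list with `System.Diagnostics.Stopwatch` would settle whether skipping the `Current` read is measurable or, as Pete predicts, lost in the noise.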
You also have to wonder whether C# uses
a using block when looping over a non-generic IEnumerable. It is
guaranteed for IEnumerable<T>, but how could it know prior to runtime
whether to do resource management or not?

I don't know whether it does or not either. I haven't bothered to look.
The fact is, it would be trivial at run-time to do an "as IDisposable"
check and dispose if necessary. In fact, I'm pretty sure this is what
happens if the foreach is on an IEnumerable instead of IEnumerable<T>.

On the other hand, the IEnumerator interface itself doesn't inherit
IDisposable, and so other code (i.e. not auto-generated by the language)
still has the same problem if someone winds up using the IEnumerable
version.
My guess is that it forgoes
the using, which is not the expected behavior (although I have
only seen one example where IEnumerator<T> actually needed to be
IDisposable).

It should be rare. As far as foregoing "using" goes, that's a
non-issue. The "using" statement is for people writing C#. The C#
compiler isn't writing C#; it's just emitting MSIL. So it can just put
the same try/finally pattern in place that it uses for "using", with the
small modification of checking for the IDisposable implementation before
trying to dispose.
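The try/finally pattern Pete describes can be sketched as follows: roughly what the compiler emits for `foreach (object o in source)` when `source` is a non-generic IEnumerable (a hand-written approximation, not decompiler output):

```csharp
using System;
using System.Collections;

static class ForeachExpansion
{
    // Manual expansion of a foreach over a non-generic IEnumerable:
    // since IEnumerator does not inherit IDisposable, the finally
    // block checks for the interface at run time before disposing.
    public static int Count(IEnumerable source)
    {
        int n = 0;
        IEnumerator e = source.GetEnumerator();
        try
        {
            while (e.MoveNext())
                n++;
        }
        finally
        {
            IDisposable d = e as IDisposable;
            if (d != null)
                d.Dispose();   // disposed only if the enumerator supports it
        }
        return n;
    }
}
```

With IEnumerable<T> the cast is unnecessary, because IEnumerator<T> inherits IDisposable and the finally block can call Dispose() directly.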
The overhead of calling private methods isn't that significant.

No, I agree...it's not. But neither is argument checking.
You're
probably right that this costs more than a couple argument checks
(except in the case of reflection). It is not always justifiable to
create a whole new method just to hold your implementation details.

I'd say it's almost never justifiable to create a whole new method just
for implementation details. The first method is (should be)
"implementation details" itself anyway.
I rarely have classes that call their own methods.

I agree that it's not very common for a class to have to call its own
public methods. But, that's not the point. The point is whether it
makes sense to have a policy to specifically _avoid_ calling one's own
public methods.

Just because it doesn't happen very often, that's no reason that if it
does come up, one should try to avoid letting it happen.
However, when I do, I
usually extract the logic to avoid unnecessary argument checking, etc.

Yes, again...we understand that. That's where we started this
discussion. :)
I have found that if I am extremely vigilant, some methods can become
static, instead of instance methods.

Irrelevant to the question of refactoring to separate argument checking
from implementation.
If enough of these appear in the
class, and they are related, I usually create a "helper" class. It is
a great way to discover unexpected code reuse.

See above (i.e. irrelevant).
Another thing to remember is that the logic within a method can be as
simple as a if/else statement or extremely complex. The more time
consuming the method, the less and less significant a little method
call overhead becomes.

Likewise argument checking. I.e. not a factor that argues in favor of
one approach or the other.
When calling between members, the call is going
to take place no matter what; it is just a matter of which methods
call which.

Likewise client code calling public members. Again, not a factor in the
discussion.
In my experience, calling public methods from others has caused me
problems.

In my experience, bypassing argument checking in arbitrarily chosen
situations has caused problems.
I lose track of which methods do what. I know this may sound
horrible, but my public interfaces change quite often during
development. Refactoring and all... Strangely, my private
implementations rarely change, just how and when they are called. This
seems a bit backward, but it's the truth. I mean, well factored logic
isn't going to change often; just how and when it is called.

That all makes sense for private implementation that isn't just a repeat
of the public API. But obviously what you wrote does not apply to
private implementation methods that are simply called directly by a
method that has as its only purpose argument checking.
Nothing in our profession should always be done one way; there's
always a special case. For this policy, there are probably about a
hundred. I take a lot of care when breaking apart my code, so I'm not
100% sold on the idea either. Personally, though, it has really paid
off and seems to scale well.

Okay. I remain unconvinced, but it's your code. :)

Pete
 
That doesn't change the point I made. Specifically: a) there's no
obvious reason a null input should be an error, and b) nothing in your
explanation of the ToIList() method suggests a need to factor it into
two pieces, one for argument checking and one for implementation details.

Blah.ToIList(list) should work the same as list.ToIList(). Agree or
disagree?
The enumerator copies them somewhere, whether you look at them or not.
And the JITter can easily refer directly to that "somewhere" (i.e. the
IEnumerator.Current property) rather than actually copying the value
again to a new variable.


How do you know that?  Have you actually proven that in production code,
using IEnumerator directly to count items in an enumeration produces a
significant performance improvement over the simpler, more maintainable
approach?


I remain unconvinced that it's a noticeable performance enhancement at
all.  I'll bet that even in a trivial benchmark that only enumerates,
there'd be no practical difference between the two, and certainly in
code that does something interesting with the enumeration, any
difference that may exist (if any) would be inconsequential.

Try it. I bet it doesn't do much for performance either. However, why
bring a variable into existence if you're not going to use it?
I don't know whether it does or not either. I haven't bothered to look.
The fact is, it would be trivial at run-time to do an "as IDisposable"
check and dispose if necessary. In fact, I'm pretty sure this is what
happens if the foreach is on an IEnumerable instead of IEnumerable<T>.

I'd like to experiment with this one also. But why risk it?
On the other hand, the IEnumerator interface itself doesn't inherit
IDisposable, and so other code (i.e. not auto-generated by the language)
still has the same problem if someone winds up using the IEnumerable
version.

That's probably why there aren't many implementations out there that
utilize this feature. Personally, I think it just makes things more
complicated. It is like providing a solution to a problem that only 2%
of the community cares about. It might be cool if someone created a
StreamEnumerator that took a file name and returned each line as an
item; a using in that situation could be nice to close the stream.
Perhaps something similar with a database query? However, I think a
more explicit approach would be better.
It should be rare.  As far as foregoing "using" goes, that's a
non-issue.  The "using" statement is for people writing C#.  The C#
compiler isn't writing C#; it's just emitting MSIL.  So it can just put
the same try/finally pattern in place that it uses for "using", with the
small modification of checking for the IDisposable implementation before
trying to dispose.

Yes, of course. I am just unsure that the compiler would even bother
checking for IDisposable. I don't know the runtime impact of dynamic
type checking, but I'm betting MS wouldn't do it for every foreach
loop. It would just cost too much. Again, why risk it? Just write your
code under that assumption.

On a similar note, ADO.NET IDbCommand objects are supposedly
IDisposable as well. However, aside from myself, I don't know any
programmers who bother to surround a command with a using block. They
do it for the connection and for IDataReaders, but not for IDbCommands.
It's reasonable to ask, "What could there be that needs disposing?" I
can't think of anything! Nonetheless, I use a using anyway. Just about
the only thing I don't use a using for is Forms.
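The nested-using shape usually written for connection/command/reader can be sketched without a database. Tracker here is a hypothetical stand-in for those ADO.NET objects; it just records dispose order so the nesting behavior is visible:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for connection/command/reader: records the
// order in which Dispose() is called.
class Tracker : IDisposable
{
    public static List<string> Disposed = new List<string>();
    private readonly string name;
    public Tracker(string name) { this.name = name; }
    public void Dispose() { Disposed.Add(name); }
}

class Program
{
    static void Main()
    {
        using (var connection = new Tracker("connection"))
        using (var command = new Tracker("command"))
        using (var reader = new Tracker("reader"))
        {
            // work with the reader here
        }
        // The innermost resource is disposed first: reader, command, connection.
        Console.WriteLine(string.Join(", ", Tracker.Disposed));
    }
}
```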
No, I agree...it's not.  But neither is argument checking.

I agree. So why pay for a method call when you can avoid it? So why
pay for an argument check when you can avoid it? I suppose it matters
which one you are going to do more often. If you are calling a method
from outside the class, it's probably faster not to have a private
method. If you are going to be calling a lot from within the class,
it's probably faster to call an unchecked implementation.
I'd say it's almost never justifiable to create a whole new method just
for implementation details.  The first method is (should be)
"implementation details" itself anyway.

See, I think there is a big difference between code that does
exception handling, resource management and argument checking and the
actual business logic. The core code should be oblivious to such
considerations and operate in a "care-free" manner. All that other
stuff is just necessary fluff as far as I am concerned. It helps other
programmers filter out what is and is not important. So what if it
requires an extra method call?
I agree that it's not very common for a class to have to call its own
public methods.  But, that's not the point.  The point is whether it
makes sense to have a policy to specifically _avoid_ calling one's own
public methods.

Just because something doesn't happen very often is no reason to try
to avoid letting it happen when it does come up.

My experience says otherwise.
Yes, again...we understand that.  That's where we started this
discussion.  :)


Irrelevant to the question of refactoring to separate argument checking
from implementation.

Sometimes the only way you find a better solution is to make
opportunities for it during development. In that sense, it is very
relevant. I have wound up with entire libraries that initially started
out as private static methods. I wouldn't have even noticed them had I
always written my logic inline.
See above (i.e. irrelevant).


Likewise argument checking.  I.e. not a factor that argues in favor of
one approach or the other.

I'd say it is less in favor of my approach.
Likewise client code calling public members.  Again, not a factor in the
discussion.

Personally, I like to think of the internals of a class as a free-for-
all. Once inside, there are no restrictions. I trust that my class is
intelligent enough to know how to work on itself. Personally, I like
to think of ArgumentException and its derivatives as uncatchable
exceptions. It's not the user's fault or bad data; it's bad programming.
Somehow I let something through that I shouldn't have. Perhaps I need
better UI validation or perhaps I just screwed up. In opposition to my
policy, immediately detecting bad arguments would make it easier to
resolve such bugs. However, all that extra code just clouds up the
significant logic. I say buy yourself the argument checking on the
public methods but give yourself the clarity of care-free private
methods. If your class is well-designed, it will be easy to work out
any obvious internal bugs with unit tests. At that point the argument
checks are really for later, when you need to use your class in
collaboration with others.
In my experience, bypassing argument checking in arbitrarily chosen
situations has caused problems.

It's not arbitrary at all. If you want to reduce the number of these
private, care-free methods floating around, make your classes smaller.
90% of the time it is just a public method calling its private method.
There isn't much calling of one private to another.
That all makes sense for private implementation that isn't just a repeat
of the public API.  But obviously what you wrote does not apply to
private implementation methods that are simply called directly by a
method that has as its only purpose argument checking.

Here is an example. Suppose you write your class to originally support
a range between two numbers. Then 6 months down the road you decide to
support ranges that include a starting number and the number of
numbers (count). You can still use the same private implementation,
just by modifying the input arguments. If you had your logic inline
with your argument checks you could have achieved the same thing.
However, before when you had a start and stop, now you must calculate
the stop. So you have the "count" variable sitting around for the rest
of the method that doesn't do anything else.
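A minimal sketch of that idea (the names here are made up for illustration): two public overloads each validate their own inputs, then translate them into the arguments of a single unchecked private core:

```csharp
using System;

static class Ranges
{
    // Public API checks arguments once, then delegates to the unchecked core.
    public static int Sum(int start, int stop)
    {
        if (stop < start) throw new ArgumentOutOfRangeException("stop");
        return SumCore(start, stop);
    }

    // Added later: a start/count overload reuses the same core by
    // translating its inputs into a start/stop pair.
    public static int SumByCount(int start, int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        return SumCore(start, start + count);
    }

    // "Care-free" implementation: assumes its arguments are already valid.
    private static int SumCore(int start, int stop)
    {
        int total = 0;
        for (int i = start; i < stop; i++) total += i;
        return total;
    }
}
```

With this shape, the "count" variable never outlives the one line that translates it, and both public entry points share one implementation.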

Okay.  I remain unconvinced, but it's your code.  :)

Different experiences I suppose. For me it has always seemed to reduce
complexity and promote small, cohesive classes. The larger the class,
the more likely there will be methods calling methods, and argument
checking flies out the window. With a small class, you can clearly see
where the logic is and the flow of execution.
 
I just read over what I am saying and I can see where it fails to
stand.

Argument checks are always really a safety net for the programmer.
They provide documentation in a sense. They don't promise that the
method will do what it says it does, just that you'd better at least
make sure of some things if you expect it to work at all!

I have met a lot of programmers who obsessively use argument checks
down to the lowest levels of their code. In their eyes, there is no
question about it: this method will do argument checking. That is a
"far right" opinion. It definitely makes code "appear" safer. Say you
pass a collection of objects through 3 layers of code. You want to
verify a) that the collection is not null and b) that each of the
objects is not null. Do you loop through all of the objects at each
layer to verify that none are null?

I have also met an even larger number of programmers who don't do
argument checking at all, "far left". Their opinion is that code
should be written to work and it's your fault if it blows up.
Typically, they work in a shop where one mind is in full control of
the code and so, "why would I ever pass a null list? Duh!" These are
the people whose code never seems to work.

I am somewhere in the middle. I like argument checks, especially when
it comes to working with others or making a library (which is my job).
Argument checks are great because they make you think about the state
your program is in when the method is called. "What could break this
code?" However, whenever I start being anal about argument checks I
also start writing fewer methods. I write fewer methods because
otherwise I have to write 10 lines of code to check my arguments...
arguments that I KNOW are correct because I JUST CHECKED THEM in the
calling method. Suddenly, I start getting 100 line methods. Suddenly,
I start having a lot more bugs. "If only I had broken this method down
into smaller pieces!" You might say that the smaller pieces wouldn't
have prevented the bug... but I guarantee they play a role.

If a method calls another that shares all the same arguments, do we do
another set of argument checks? What if only 1 of them is shared? What
about 2, 3, ...? At what point are you comfortable that your arguments
don't require re-checking? Would it make a difference if the two
methods were in different classes? If you didn't have multiple
methods, just one big method, would you have thought to do the same
checks? multiple times?

Do we limit ourselves to writing methods only when they reduce
duplication? A lot of programmers think methods should be used to
create named operations. A lot of authors seem to think complex
boolean operations are better described this way. So, when is writing
a separate method okay? At what point does trying to keep the logic in
the public method start to cost clarity? At what point does avoiding
argument checking in private methods start to increase the likelihood
of bugs? I say, if you keep your methods small and simple, argument
checks within a class aren't essential. Start adding some complexity
and a safety net starts to sound good.
 
Blah.ToIList(list) should work the same as list.ToIList(). Agree or
disagree?

What is "list.ToIList()"?

I can neither agree nor disagree if I don't know what you're asking.
[...]
I remain unconvinced that it's a noticeable performance enhancement at
all. I'll bet that even in a trivial benchmark that only enumerates,
there'd be no practical difference between the two, and certainly in
code that does something interesting with the enumeration, any
difference that may exist (if any) would be inconsequential.

Try it. I bet it doesn't do much for performance either. However, why
bring a variable into existence if you're not going to use it?

Because the code is simpler doing so. All else being equal,
less/simpler code is always better.
I'd like to experiment with this one also. But why risk it?

Why risk what?
[...]
It should be rare. As far as foregoing "using" goes, that's a
non-issue. The "using" statement is for people writing C#. The C#
compiler isn't writing C#; it's just emitting MSIL. So it can just put
the same try/finally pattern in place that it uses for "using", with the
small modification of checking for the IDisposable implementation before
trying to dispose.

Yes, of course. I am just unsure that the compiler would even bother
checking for IDisposable.

It had better. It's one thing for your own code to create an
IDisposable and then fail to dispose it. But for the compiler to do so?
That'd clearly be a bug.

Here is the relevant portion of the C# specification:

A foreach statement of the form

foreach (V v in x) embedded-statement

is then expanded to:

{
    E e = ((C)(x)).GetEnumerator();
    try {
        V v;
        while (e.MoveNext()) {
            v = (V)(T)e.Current;
            embedded-statement
        }
    }
    finally {
        … // Dispose e
    }
}

and:

The body of the finally block is constructed according
to the following steps:

• If there is an implicit conversion from E to the
  System.IDisposable interface, then the finally clause
  is expanded to the semantic equivalent of:

      finally {
          ((System.IDisposable)e).Dispose();
      }

  except that if e is a value type, or a type parameter
  instantiated to a value type, then the cast of e to
  System.IDisposable will not cause boxing to occur.

• Otherwise, if E is a sealed type, the finally clause
  is expanded to an empty block:

      finally {
      }

• Otherwise, the finally clause is expanded to:

      finally {
          System.IDisposable d = e as System.IDisposable;
          if (d != null) d.Dispose();
      }

  The local variable d is not visible to or accessible
  to any user code. In particular, it does not conflict
  with any other variable whose scope includes the finally
  block.

Which is all a long way of saying, the compiler _always_ disposes
disposable enumerators. This is true even for the IEnumerator interface.
I don't know the runtime impact of dynamic
type checking, but I'm betting MS wouldn't do it for every foreach
loop.

You'd lose that bet. It happens all the time. The "foreach" statement
can do all sorts of implicit extra work you might not realize, including
casting, type-checking, and even disposing.
It would just cost too much. Again, why risk it? Just write your
code under that assumption.

Where's the risk?
On a similar note, ADO.NET IDbCommand objects are supposedly
IDisposable as well. However, aside from myself, I don't know any
programmers who bother to surround a command with a using block. They
do it for the connection and for IDataReaders, but not for IDbCommands.
It's reasonable to ask, "What could there be that needs disposing?" I
can't think of anything! Nonetheless, I use a using anyway. Just about
the only thing I don't use a using for is Forms.

If you are showing a Form instance modally, you definitely should be
using a "using" statement or otherwise disposing the instance when
you're done with it.

If you are not showing a Form instance modally, the documentation is
very clear that when the Form instance is closed, it is disposed.

Either way, guidance is clear, and there's no question about what the
right thing to do is.

In any case, this is a complete non-issue with respect to the question
of whether to separate argument checking into a different method from
the implementation, as well as to the question (the ACTUAL one at hand)
of whether it makes sense to have a policy of a class's code never
calling that class's own public methods.
I agree. So why pay for a method call when you can avoid it?
Exactly.

So why pay for an argument check when you can avoid it?

Because a) you can't really avoid it (the code is broken if you do), and
b) because the code is _simpler_ and no worse when you don't avoid it.

In other words, there's a benefit to not avoiding it, and none to
avoiding it.
I suppose it matters
which one you are going to do more often. If you are calling a method
from outside the class, it's probably faster not to have a private
method. If you are going to be calling a lot from within the class,
it's probably faster to call an unchecked implementation.

I doubt you will find a significant performance difference either way.
The question doesn't depend on the issue of performance. It's a code
correctness and maintenance issue.
See, I think there is a big difference between code that does
exception handling, resource management and argument checking and the
actual business logic. The core code should be oblivious to such
considerations and operate in a "care-free" manner. All that other
stuff is just necessary fluff as far as I am concerned. It helps other
programmers filter out what is and is not important. So what if it
requires an extra method call?

Frankly, I disagree with your desire to separate input checking from
implementation. However, even if you assume that's a reasonable goal,
the fact remains that in no way justifies a _policy_ of never calling
public methods of a class from within the class.
My experience says otherwise.

Your experience says that for X where X doesn't happen very often, that
automatically means that you should try to avoid letting X happen?

That's pretty bizarre logic if you ask me. I can think of lots of very
nice things that don't happen very often, but which I definitely would
not like to avoid letting happen.
[...]
Irrelevant to the question of refactoring to separate argument checking
from implementation.

Sometimes the only way you find a better solution is to make
opportunities for it during development. In that sense, it is very
relevant. I have wound up with entire libraries that initially started
out as private static methods. I wouldn't have even noticed them had I
always written my logic inline.

Huh? There is NOTHING about the question of separating argument
checking from the rest of the implementation that has ANYTHING to do
with whether a method can be a static method or not.

It's completely irrelevant to this discussion. That's why I wrote this
next line:
I'd say it is less in favor of my approach.

If you like. I don't see how it's different one way or the
other...expensive methods tend to hide implementation details intended
to reduce overhead, regardless of what overhead we're talking about.

But if you want to use that to argue against your own policy, that's
fine with me. :)
Personally, I like to think of the internals of a class as a free-for-
all.

Well, I'll just have to disagree. I find there's a real benefit in
having internals have many of the same encapsulation and data
verification features found elsewhere.

Now, that's not to say every single method goes around verifying its
input. But, a useful goal is to try to be able to prove at some point
in a call chain, based on methods higher in the call chain, that you
know for sure the data is valid. Calling through a public method that
does argument checking is a perfectly reasonable thing to do, where
there's a public method that represents the thing that code within the
class needs to do.
Once inside, there are no restrictions. I trust that my class is
intelligent enough to know how to work on itself. Personally, I like
to think of ArgumentException and its derivatives as uncatchable
exceptions. It's not the user's fault or bad data; it's bad programming.

Indeed. In fact, even for public members, it is debatable whether that
sort of argument checking is really required. It depends a lot on
context, and in some contexts, it's sufficient if a bad argument is
passed, even to a public member, to just let that public member fail
when it reaches the bad argument.

It depends on what the actual consumer of the public method is, and the
nature of the argument, among other things. Policies that don't take
those issues into account are flawed policies.
Somehow I let something through that I shouldn't have. Perhaps I need
better UI validation or perhaps I just screwed up. In opposition to my
policy, immediately detecting bad arguments would make it easier to
resolve such bugs. However, all that extra code just clouds up the
significant logic.

Well, what's more important? Ease of maintenance, and correct code? Or
avoiding having to scroll past a few lines of argument checking to see
the main implementation of a method?

In fact, there are patterns people have come up with that help
encapsulate argument checking somewhat, to minimize the prologue code in
a method. And in .NET 4.0, code contracts should help a lot to reduce
the explicit argument checking code that can distract from
implementation details.
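One such pattern, sketched here with a hypothetical Guard helper (not a real library API), collapses the prologue to one line per argument:

```csharp
using System;

// Hypothetical guard-clause helper: one common pattern for shrinking
// the argument-checking prologue of a method.
static class Guard
{
    public static void NotNull(object value, string name)
    {
        if (value == null) throw new ArgumentNullException(name);
    }

    public static void InRange(bool condition, string name)
    {
        if (!condition) throw new ArgumentOutOfRangeException(name);
    }
}

static class Example
{
    // The prologue is now two lines instead of two if-blocks.
    public static int CountItems(int[] items, int start)
    {
        Guard.NotNull(items, "items");
        Guard.InRange(start >= 0 && start <= items.Length, "start");
        return items.Length - start;
    }
}
```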

So it's not that the issue you raise is something other people don't
notice, or something that doesn't affect the way people code. It is.
But it's just not significant enough to justify a policy to never call
public methods in a class from within the class.
[...]
In my experience, bypassing argument checking in arbitrarily chosen
situations has caused problems.

It's not arbitrary at all. If you want to reduce the number of these
private, care-free methods floating around, make your classes smaller.

Small classes are a good thing, but making classes smaller just so you
can avoid having a single class call its own public methods doesn't make
any sense. The policy is bad enough when applied to an implementation
detail such as where the implementation of a method actually exists, but
when it starts guiding the actual _design_ of your class hierarchies,
that's really going too far.
[...]
That all makes sense for private implementation that isn't just a repeat
of the public API. But obviously what you wrote does not apply to
private implementation methods that are simply called directly by a
method that has as its only purpose argument checking.

Here is an example. Suppose you write your class to originally support
a range between two numbers. Then 6 months down the road you decide to
support ranges that include a starting number and the number of
numbers (count). You can still use the same private implementation,
just by modifying the input arguments.

Uh, okay...let's see if I have this straight:

-- Your policy is based on the desire to avoid the overhead of an
argument check when calling a method from within the class

-- This is in spite of the overhead created by calling an extra method

-- On top of that, you conclude that there's a benefit to making a
private implementation called by the public method, even though that
means that your public method will have to include even _more_ overhead
in order to translate the given inputs to those used by the private
implementation (e.g. adding a count to the first argument so that you
can pass a start and end number to the private implementation).

As you know, I really think it doesn't make sense to waste time
optimizing code until you know for sure it needs optimizing. It's easy
enough to avoid doing _stupid_ things that will completely break
performance, leaving any remaining potential optimizations having only
relatively small degrees of effect on the overall execution of your code.

So there's certainly a philosophical disconnect between what you say is
the root of your policy, and my own preferences. But, if you're going
to argue in favor of the policy on the basis of minimizing overhead, I
don't see how it helps support your policy when you point out the ways
that your policy leads to a requirement to add overhead.

I mean, there are some things about your statements relative to your
policy that make me wonder which of us is arguing against the policy,
you or me! :)
If you had your logic inline
with your argument checks you could have achieved the same thing.
However, before when you had a start and stop, now you must calculate
the stop. So you have the "count" variable sitting around for the rest
of the method that doesn't do anything else.

How is that worse than having a whole extra method in your call stack,
with its own "start" and "count" arguments, and of course the overhead
of the call itself?

In any case, I would say that in most cases, your implementation would
reflect the inputs. Rather than writing:

void Method(int start, int count)
{
    int last = start + count;

    for (int i = start; i < last; i++)
    {
    }
}

I think it more likely you'd have this:

void Method(int start, int count)
{
    for (int i = start; i < start + count; i++)
    {
    }
}

Or even this (slightly less efficient, but even fewer variables and less
code):

void Method(int start, int count)
{
    while (count-- > 0)
    {
        start++;
    }
}

There's no extra variable there (even if that were a serious
consideration, which it's not).
Different experiences I suppose. For me it has always seemed to reduce
complexity and promote small, cohesive classes. [...]

I found the idea that you can reduce complexity and promote a smaller
class by _adding_ methods to the class, um...well, counter-intuitive at
the least. If not outright self-contradictory. :)

Pete
 
Sorry for clearing the board. I agree with everything you are saying.
Pretty much everyone agrees that pre- and post-conditions should be a
part of your system. At some point in my career I guess I just decided
that the public interface would be where I dealt with argument
checking. I mean, no one really tells you where it is appropriate.

I come from a pretty small shop. Most of the programmers here wouldn't
even bother with argument checking at all. They assume they are the
center of the universe and their classes are simple enough to have
straightforward calling semantics. In general, they are right. We are
pretty inconsistent about how we do error handling, but overall we get
the job done.

However, the more I think about it, the more I realize that when
implementing a very complex class, adding checks throughout the
private code would be very useful for detecting bugs early on.

You're absolutely right about the performance thing, too. There is no
point in having extremely high-performance code that doesn't work.
Again, I think my many years of working in C++ have permanently made
me too performance happy. If you read a mainstream C++ book, 90% of
the time the author will be talking about some technique that improves
performance ever so slightly. I wish I could get away from that mind
set.

It doesn't help that I am writing an algorithms library for educational
purposes. The goal of the library is to provide Linq- and STL-like
implementations. I want the code to be straightforward and correct.
How can I make the code approachable and easily reusable? I don't want
to labor readers with 2 pages of argument checking. I want to allow them
to go straight to the core code. Today, I was working on an STL-like
merge method. Let me show you what the public method looks like:

public static int Merge<T>(this IList<T> list1,
                           int index1,
                           int count1,
                           IList<T> list2,
                           int index2,
                           int count2,
                           IList<T> destination,
                           int destinationIndex,
                           Func<T, T, int> comparison)
{
    if (list1 == null)
    {
        throw new ArgumentNullException("list1",
            Resources.NullFirstList);
    }
    if (index1 < 0 || index1 > list1.Count)
    {
        throw new ArgumentOutOfRangeException("index1",
            index1, Resources.IndexOutOfRange);
    }
    if (count1 < 0)
    {
        throw new ArgumentOutOfRangeException("count1",
            count1, Resources.NegativeCount);
    }
    if (list1.Count < index1 + count1)
    {
        throw new ArgumentOutOfRangeException("count1",
            count1, Resources.CountBiggerThanFirstList);
    }
    if (list2 == null)
    {
        throw new ArgumentNullException("list2",
            Resources.NullSecondList);
    }
    if (index2 < 0 || index2 > list2.Count)
    {
        throw new ArgumentOutOfRangeException("index2",
            index2, Resources.IndexOutOfRange);
    }
    if (count2 < 0)
    {
        throw new ArgumentOutOfRangeException("count2",
            count2, Resources.NegativeCount);
    }
    if (list2.Count < index2 + count2)
    {
        throw new ArgumentOutOfRangeException("count2",
            count2, Resources.CountBiggerThanSecondList);
    }
    if (destination == null)
    {
        throw new ArgumentNullException("destination",
            Resources.NullDestinationList);
    }
    if (destinationIndex < 0 || destinationIndex > destination.Count)
    {
        throw new ArgumentOutOfRangeException(
            "destinationIndex", destinationIndex, Resources.IndexOutOfRange);
    }
    if (destination.Count < destinationIndex + count1 + count2)
    {
        throw new ArgumentException(
            Resources.CountBiggerThanDestinationList, "destination");
    }
    if (comparison == null)
    {
        throw new ArgumentNullException("comparison",
            Resources.NullComparison);
    }
    return merge(list1,
                 index1,
                 index1 + count1,
                 list2,
                 index2,
                 index2 + count2,
                 destination,
                 destinationIndex,
                 comparison);
}

That's a lot of argument checking! Part of the problem is just the
large number of arguments. Having .NET 4's code contracts would
definitely help to cut that down a lot.

The problem with my policy was that for every overload, I was
reproducing the necessary argument checks. That amounts to a lot of
code and a lot of unit testing.

You can see why I am excited for default parameters. That is also why
I have changed my policy (specific to this project) to allow a less-
specified overload to call the most-specified overload. Only the most-
specified overload will do argument checking. That way I get the
benefit of code reuse and a smaller code set. Eventually, when 4.0
comes out, I can replace my overloads with default parameters.
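That policy can be sketched like this (the names are hypothetical, not from the actual library): the less-specified overloads forward to the most-specified one, and only that final overload validates:

```csharp
using System;
using System.Linq;

static class Printer
{
    // Less-specified overloads just forward, without checking anything.
    public static string Repeat(string text)
    {
        return Repeat(text, 1);
    }

    public static string Repeat(string text, int count)
    {
        return Repeat(text, count, " ");
    }

    // Only the most-specified overload does argument checking.
    public static string Repeat(string text, int count, string separator)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (count < 0) throw new ArgumentOutOfRangeException("count");
        if (separator == null) throw new ArgumentNullException("separator");
        return string.Join(separator, Enumerable.Repeat(text, count));
    }
}
```

Converting to C# 4.0 default parameters later then means collapsing the first two overloads into `Repeat(string text, int count = 1, string separator = " ")`.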

It will mean that I am checking arguments that I know are correct, but
who cares? Once I have done that, I might consider eliminating the
private methods.

Just as another strange policy, I don't make any of my private methods
extension methods. I think of that as part of the public interface and
ignore it within my classes. What do you think about that policy?

So, I think our discussion did me some good. I think I will start
doing more argument checking within the guts of my classes. I still
think I will prefer a public/private implementation when resource
management, exception handling, etc. get complicated. I am not too
worried about the overhead of a method call. That's more for my own
sense of readability than anything.
 
Sorry for clearing the board. I agree with everything you are saying.
Pretty much everyone agrees that pre- and post-conditions should be a
part of your system. At some point in my career I guess I just decided
that the public interface would be where I dealt with argument
checking. I mean, no one really tells you where it is appropriate.

I come from a pretty small shop. Most of the programmers here wouldn't
even bother with argument checking at all. They assume they are the
center of the universe and their classes are simple enough to have
straightforward calling semantics. In general, they are right. We are
pretty inconsistent about how we do error handling, but overall we get
the job done.

However, the more I think about it, the more I realize that when
implementing a very complex class, adding checks throughout the
private code would be very useful for detecting bugs early on.

No surprise, I agree there. :) Of course, as I said before, that
doesn't mean you have to put error-checking everywhere. But doing it
strategically, at choke points where doing it once can ensure all the
code that follows will have the correct inputs, can produce a nice benefit.
You're absolutely right about the performance thing, too. There is no
point in having extremely high-performance code that doesn't work.
Again, I think my many years of working in C++ have permanently made
me too performance happy. If you read a mainstream C++ book, 90% of
the time the author will be talking about some technique that improves
performance ever so slightly. I wish I could get away from that mind
set.

For what it's worth, it's a mindset that really doesn't actually work
all that well in C or C++ either. It leads to too much "tricky" code,
where flaws are much more likely, and maintenance is much more difficult.

The quote about "premature optimization is the root of all evil"
predates C# (and similar languages, like Java) by quite a long while. :)
It doesn't help that I am writing an algorithms library for educational
purposes. The goal of the library is to provide Linq- and STL-like
implementations. I want the code to be straightforward and correct.
How can I make the code approachable and easily reusable? I don't want
to labor readers with 2 pages of argument checking. I want to allow them
to go straight to the core code. Today, I was working on an STL-like
merge method. Let me show you what the public method looks like:

[...]
That's a lot of argument checking! Part of the problem is just the
large number of arguments.

_Part_ of the problem? Give me a break; that's practically ALL of your
problem! :)

I mean, you've got a method with a ridiculous number of arguments. It
stands to reason that the method's argument checking will have a
ridiculous amount of code.

That said, your argument checking is not necessarily as efficient as it
could be. At a very slight cost of less specific error messages, you
could combine a series of checks like this:

if (index1 < 0 || index1 > list1.Count)
{
    throw new ArgumentOutOfRangeException("index1", index1,
        Resources.IndexOutOfRange);
}
if (count1 < 0)
{
    throw new ArgumentOutOfRangeException("count1", count1,
        Resources.NegativeCount);
}
if (list1.Count < index1 + count1)
{
    throw new ArgumentOutOfRangeException("count1", count1,
        Resources.CountBiggerThanFirstList);
}

into a slightly shorter, slightly less costly check like this:

if (count1 < 0)
{
    throw new ArgumentOutOfRangeException("count1", count1,
        Resources.NegativeCount);
}
if (index1 < 0 || list1.Count < index1 + count1)
{
    throw new ArgumentOutOfRangeException("index1", index1,
        Resources.IndexOutOfRange);
}

After all, if "index1 + count1" is beyond the end of the list, who's to
say it's the "count1" that's out of range rather than "index1"? The
real problem is the combination of the two, so unless you're _also_
going to report a specific error for when the "count1" by itself exceeds
the length of the list (making your already lengthy argument checking
even lengthier), why bother worrying about which one is actually the
trouble-maker?

There is also the question as to why that particular method requires the
destination list to already have enough room for the merged data. In
the .NET world, an IList having entries in the list is essentially an
implication that there is valid data in those entries. That's why we
use an IList instead of an array, so we don't have to have empty members
of the list.

So one could argue that your API should be fixed so that the destination
IList simply needs to be non-null. You could even skip checking the
destination index, if you defined an out-of-range value to simply mean
"at the beginning/end of the list" (depending on which way it's out of
range). But even if you don't do that, you can eliminate one more
argument check if the method just uses the Insert() or Add() methods
rather than requiring that empty destination list elements already be
present.

(If you are concerned about performance, the method itself can always do
a little extra work to avoid having to repeatedly shift the data in the
original IList every time a new item is inserted).
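A simplified sketch of that alternative (this is not the poster's actual Merge; the index/count parameters are omitted to keep it short): appending with Add() removes the need to validate that the destination is already big enough:

```csharp
using System;
using System.Collections.Generic;

static class MergeSketch
{
    // Merge two sorted lists into the destination by appending, so the
    // destination only needs to be non-null, not pre-sized.
    public static void Merge<T>(IList<T> a, IList<T> b, IList<T> destination,
                                Comparison<T> comparison)
    {
        if (a == null) throw new ArgumentNullException("a");
        if (b == null) throw new ArgumentNullException("b");
        if (destination == null) throw new ArgumentNullException("destination");
        if (comparison == null) throw new ArgumentNullException("comparison");

        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
            destination.Add(comparison(a[i], b[j]) <= 0 ? a[i++] : b[j++]);
        while (i < a.Count) destination.Add(a[i++]);
        while (j < b.Count) destination.Add(b[j++]);
    }
}
```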

Of course, even after those kinds of changes, you'll still have a
ridiculous number of "if" statements checking arguments. But then,
again...the method does have a ridiculous number of arguments to start
with. It's only natural the implementation of the method might have
some ridiculous aspects to it as well. :)
[...]
Just as another strange policy, I don't make any of my private methods
extension methods. I think of that as part of the public interface and
ignore it within my classes. What do you think about that policy? [...]

I'm not sure what you mean. How would a private method be an extension
method? Extension methods have to be public.

In general, my feeling is that extension methods should be limited to
situations where you either need to extend a particular class without
inheriting it, and/or you want to provide a suite of functionality
applicable to an interface (a la the Enumerable class). In particular,
writing extension methods as part of your _own_ class's API doesn't seem
right to me.
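As a tiny illustration of that interface-suite case (a la Enumerable), here's what such an extension method typically looks like; the class and method names are mine, purely for illustration:

```csharp
using System;
using System.Collections.Generic;

// A small suite of functionality applicable to an interface,
// exposed as extension methods on IEnumerable<T>.
static class SequenceExtensions
{
    // Returns true if the sequence yields no elements.
    public static bool None<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            return !e.MoveNext();
        }
    }
}
```

Callers then write `sequence.None()` against any IEnumerable<T>, without the sequence types themselves knowing about the method.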

Whether that view relates to what you're asking about, I'm not sure.

Pete
 
Peter said:
[...]
[...]
Just as another strange policy, I don't make any of my private methods
extension methods. I think of that as part of the public interface and
ignore it within my classes. What do you think about that policy? [...]

I'm not sure what you mean. How would a private method be an extension
method? Extension methods have to be public. [...]

Except, of course, if they are called only by other static/extension
methods in the same class. Since such a class is always a static class,
I take as granted that's not really an interesting scenario relative to
the discussion.

But, if that's in fact the crux of your policy, and you're talking about
a static class that has as its public API only extension methods, maybe
that explains what I don't understand about your comment.

Pete
 
Peter said:
[...]
[...]
Just as another strange policy, I don't make any of my private methods
extension methods. I think of that as part of the public interface and
ignore it within my classes. What do you think about that policy? [...]
I'm not sure what you mean.  How would a private method be an extension
method?  Extension methods have to be public. [...]

Except, of course, if they are called only by other static/extension
methods in the same class.  Since such a class is always a static class,
I take as granted that's not really an interesting scenario relative to
the discussion.

But, if that's in fact the crux of your policy, and you're talking about
a static class that has as its public API only extension methods, maybe
that explains what I don't understand about your comment.

Pete

Technically, an extension method can be non-public. Since I have all
these silly private implementations, I have to be sure not to place
"this" in the private methods. It would only benefit/degrade the
declaring static class; it is not really worth a discussion.

I try to avoid methods like Merge in my library. I have a version that
works with IEnumerables. For instance,

private static IEnumerable<T> merge<T>(IEnumerable<T> collection1,
    IEnumerable<T> collection2,
    Func<T, T, int> comparison)
{
    using (IEnumerator<T> enumerator1 = collection1.GetEnumerator())
    using (IEnumerator<T> enumerator2 = collection2.GetEnumerator())
    {
        bool hadMore1 = enumerator1.MoveNext();
        bool hadMore2 = enumerator2.MoveNext();
        while (hadMore1 && hadMore2)
        {
            int result = comparison(enumerator1.Current,
                enumerator2.Current);
            if (result < 0)
            {
                yield return enumerator1.Current;
                hadMore1 = enumerator1.MoveNext();
            }
            else if (result > 0)
            {
                yield return enumerator2.Current;
                hadMore2 = enumerator2.MoveNext();
            }
            else
            {
                yield return enumerator1.Current;
                yield return enumerator2.Current;
                hadMore1 = enumerator1.MoveNext();
                hadMore2 = enumerator2.MoveNext();
            }
        }
        while (hadMore1)
        {
            yield return enumerator1.Current;
            hadMore1 = enumerator1.MoveNext();
        }
        while (hadMore2)
        {
            yield return enumerator2.Current;
            hadMore2 = enumerator2.MoveNext();
        }
    }
}

This is obviously a private implementation without the heavy argument
checks. I just check that the three arguments aren't null.
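One standard C# caveat worth flagging here (my note, not from the thread): if those null checks live inside the iterator method itself, they don't run until the sequence is first enumerated, because iterator bodies execute lazily. The usual pattern is a non-iterator public wrapper that validates eagerly and delegates to the private iterator; a sketch, with a deliberately trivial placeholder body since only the check placement matters:

```csharp
using System;
using System.Collections.Generic;

static class EagerCheckSketch
{
    // Public wrapper: not an iterator, so these throw at call time,
    // not at the first MoveNext().
    public static IEnumerable<T> Merge<T>(IEnumerable<T> collection1,
        IEnumerable<T> collection2, Func<T, T, int> comparison)
    {
        if (collection1 == null) throw new ArgumentNullException("collection1");
        if (collection2 == null) throw new ArgumentNullException("collection2");
        if (comparison == null) throw new ArgumentNullException("comparison");
        return MergeIterator(collection1, collection2, comparison);
    }

    // Private iterator: body is deferred; real merge logic would go here.
    // (Placeholder: just concatenates, to keep the sketch short.)
    private static IEnumerable<T> MergeIterator<T>(IEnumerable<T> c1,
        IEnumerable<T> c2, Func<T, T, int> cmp)
    {
        foreach (T t in c1) yield return t;
        foreach (T t in c2) yield return t;
    }
}
```

With this split, Merge(null, ...) throws immediately at the call site instead of surprising the caller later inside a foreach.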

I provide IList<T> versions of my algorithms for people who "think"
working on array-based data structures is more efficient. In those
implementations I always take a destination IList<T> and assume it is
fixed size (since it could be an instance of Array). I find that most
of the (very few) people who seem to care for them are STL enthusiasts
who haven't become comfortable with C# yet. Performance-wise they
aren't that great because they work on IList<T>, which makes
polymorphic calls. Everyone else who uses these methods simply likes
having operations that can apply to both List<T> and Array; it makes
the code base appear more unified. I wonder why they work with Array
at all...

Then again, my library is mostly for educational purposes. Most of the
users I have talked to say that they usually just copy and paste what
they need. It is hard to track what people are actually using. My
guess is that most people avoid 90% of my library anyway.

Some people will always try to program in their favorite language no
matter what compiler they're using. Libraries like NSTL don't help
people who, like I once did, try to take a great idea from one
language and use it in another. Without proper C++ templates, the
STL simply doesn't work well. The iterator concept doesn't hold water
in C#. Nothing is worse than using a hand-made data structure when a
plain-jane BCL data structure would do the trick! Square peg, round
hole.

Also, I've thought about your suggestion about slimming down my
argument checking before. I am revamping my library right now, so I
might include that change in the next release. You're right that the
error is not specific to the count. I think my original reason had to
do with the fact that 99% of the time people make an off-by-one error
and make the count one too big. Typically, I provide overloads that
compute the count automatically. So, I figured if someone was
explicitly stating the count, then it should be the count I throw an
error on. Otherwise, why would they bother being explicit? I don't know,
just more of my moth-ridden logic.
 
[...]
I try to avoid methods like Merge in my library. I have a version that
works with IEnumerables. For instance,

Because I can't resist...

private static IEnumerable<T> Merge<T>(IEnumerable<T> collection1,
    IEnumerable<T> collection2, Comparison<T> comparison)
{
    using (IEnumerator<T> enumerator1 = collection1.GetEnumerator(),
        enumerator2 = collection2.GetEnumerator())
    {
        bool hadMore1 = enumerator1.MoveNext(),
            hadMore2 = enumerator2.MoveNext();

        while (hadMore1 || hadMore2)
        {
            if (hadMore1 &&
                (!hadMore2 || comparison(enumerator1.Current,
                    enumerator2.Current) <= 0))
            {
                yield return enumerator1.Current;
                hadMore1 = enumerator1.MoveNext();
            }
            else
            {
                yield return enumerator2.Current;
                hadMore2 = enumerator2.MoveNext();
            }
        }
    }
}

I prefer that form, simply because it has less code and I like less code
(less typing, less maintenance). And note that if you're into
micro-optimization :), making the code smaller is one of the best things
you can do to improve performance. Loop unrolling still has its place
in the right circumstances, but smaller code usually wins these days.

The above should be functionally equivalent to the code you posted, but
of course is more compact. If you're looking to keep your methods small
and simple, I'd think something like that would be the way to go.

The rest of what you wrote is interesting, makes sense, and I have
nothing more to say on it. :) I just felt compelled to share the above
alternative implementation for your Merge() method.

Pete
 
Peter said:
[...]
The above should be functionally equivalent to the code you posted,

Hmmm...I take that back. There's one exception I see now: if both
sequences have a run of elements that are equal in both sequences, your
implementation alternates between the two sequences, while my
implementation will return all the elements from the first sequence
first, then all the elements from the second sequence.

That is, assuming:

T: Value, Name properties
sequence 1: { { 5, "a1" }, { 5, "b1" }, { 5, "c1" } }
sequence 2: { { 5, "a2" }, { 5, "b2" }, { 5, "c2" } }
comparison: (t1, t2) => t1.Value.CompareTo(t2.Value)

Then the result for yours is:

{ 5, "a1" }, { 5, "a2" }, { 5, "b1" }, { 5, "b2" }, etc.

While the result for mine is:

{ 5, "a1" }, { 5, "b1" }, { 5, "c1" }, { 5, "a2" }, etc.

I would think this should not matter, but if it does, you would have to
tweak the code I posted so that it produces the same output for that
kind of input (but the tweaked version would still be shorter than your
current implementation :) ).
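The tie-breaking difference is easy to see by running both shapes side by side. Here's a self-contained demo (names and the all-equal comparison are mine, just to isolate the behavior); since every key compares equal, only the tie order distinguishes the two merges:

```csharp
using System;
using System.Collections.Generic;

class TieOrderDemo
{
    // Travis's shape: on a tie, yield one element from each side,
    // then advance both enumerators.
    static IEnumerable<T> MergeAlternating<T>(IEnumerable<T> c1,
        IEnumerable<T> c2, Func<T, T, int> cmp)
    {
        using (IEnumerator<T> e1 = c1.GetEnumerator())
        using (IEnumerator<T> e2 = c2.GetEnumerator())
        {
            bool m1 = e1.MoveNext(), m2 = e2.MoveNext();
            while (m1 && m2)
            {
                int r = cmp(e1.Current, e2.Current);
                if (r < 0) { yield return e1.Current; m1 = e1.MoveNext(); }
                else if (r > 0) { yield return e2.Current; m2 = e2.MoveNext(); }
                else
                {
                    yield return e1.Current;
                    yield return e2.Current;
                    m1 = e1.MoveNext();
                    m2 = e2.MoveNext();
                }
            }
            while (m1) { yield return e1.Current; m1 = e1.MoveNext(); }
            while (m2) { yield return e2.Current; m2 = e2.MoveNext(); }
        }
    }

    // Pete's shape: ties (<= 0) always come from the first sequence,
    // so a run of equal keys drains sequence 1 before sequence 2.
    static IEnumerable<T> MergeFirstWins<T>(IEnumerable<T> c1,
        IEnumerable<T> c2, Func<T, T, int> cmp)
    {
        using (IEnumerator<T> e1 = c1.GetEnumerator())
        using (IEnumerator<T> e2 = c2.GetEnumerator())
        {
            bool m1 = e1.MoveNext(), m2 = e2.MoveNext();
            while (m1 || m2)
            {
                if (m1 && (!m2 || cmp(e1.Current, e2.Current) <= 0))
                { yield return e1.Current; m1 = e1.MoveNext(); }
                else
                { yield return e2.Current; m2 = e2.MoveNext(); }
            }
        }
    }

    static void Main()
    {
        string[] s1 = { "a1", "b1", "c1" }, s2 = { "a2", "b2", "c2" };
        Func<string, string, int> equalKeys = (x, y) => 0;

        Console.WriteLine(string.Join(", ", MergeAlternating(s1, s2, equalKeys)));
        // a1, a2, b1, b2, c1, c2
        Console.WriteLine(string.Join(", ", MergeFirstWins(s1, s2, equalKeys)));
        // a1, b1, c1, a2, b2, c2
    }
}
```

Note that for inputs without runs of equal keys in both sequences, the two produce identical output; the divergence only appears on cross-sequence ties.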

Pete
 