best practice for variable declaration

  • Thread starter Thread starter Jiho Han
  • Start date Start date
J

Jiho Han

If I have a class, whether it's simple or complex, is it better to declare
it outside of a looping construct?

For example,

CustomClass objVar;

while (somecondition)
{
objVar = new CustomClass(somevariable1, somevariable2);
...
}

vs.

while(somecondition)
{
CustomClass objVar = new CustomClass(somevariable1, somevariable2);
...
}

The question is purely regarding the declaration of the variable inside the
loop construct.
Would the second form have any detrimental effect on either the memory
consumption or the performance?

Thanks
Jiho Han
 
Yes, it is best to declare it once, otherwise I believe the variable is
reallocated every time through the loop.

At the very least, it is better practice to put all your variable
declarations at the top of your method anyway, even if there is no
performance benefit.
 
Jiho Han said:
If I have a class, whether it's simple or complex, is it better to declare
it outside of a looping construct?

Inside - it keeps the variable scope smaller and makes it more obvious
that it's only used in the loop. It has no performance penalties.

There's one exception to this, however: if you need one *object* which
can be used for every iteration of the loop, it will generally be
better to create that once outside the loop and then use it inside the
loop. For instance:

MyExpensiveObject foo = new MyExpensiveObject();
for (int i=0; i < 100; i++)
{
foo.DoSomething();
}

rather than

for (int i=0; i < 100; i++)
{
MyExpensiveObject foo = new MyExpensiveObject();
foo.DoSomething();
}

It really depends on the semantics of the class/object involved though
- usually those will dictate whether or not you *need* to use the same
object or a different object each time.
 
Marina said:
Yes, it is best to declare it once, otherwise I believe the variable is
reallocated every time through the loop.

The variable is on the stack - no memory is allocated each time, only
the value is assigned, which (to get the same semantics) it would be
anyway.
At the very least, it is better practice to put all your variable
declarations at the top of your method anyway, even if there is no
performance benefit.

Yikes no! That makes it far from obvious which variable is used where.
It's also a pain when actually writing code.

I personally prefer "keep the scope as tight as possible, and where you
can, only declare at the point of first use".
 
I am glad to find that there are no performance penaties. I do agree on the
limiting of the variable scope to the relevant section and generally keeping
declarations close to the section in which they are used.

Thanks so much.
Jiho Han

Jon Skeet said:
Inside - it keeps the variable scope smaller and makes it more obvious
that it's only used in the loop. It has no performance penalties.

There's one exception to this, however: if you need one *object* which
can be used for every iteration of the loop, it will generally be
better to create that once outside the loop and then use it inside the
loop. For instance:

MyExpensiveObject foo = new MyExpensiveObject();
for (int i=0; i < 100; i++)
{
foo.DoSomething();
}

rather than

for (int i=0; i < 100; i++)
{
MyExpensiveObject foo = new MyExpensiveObject();
foo.DoSomething();
}

Of course. But an excellent point nonetheless as we sometimes forget to
check for these things.
 
I find code littered with variable declarations in the middle of functions
very difficult to read, it breaks the flow of trying to figure out what's
going on. To me, it just makes the methods look messy and unorganized.

I am confused as to your explanation, I would expect the variable to come
off the stack at the bottom of the loop, and then be pushed back on at the
next iteration of the loop. But if you say so...
 
Jon Skeet said:
The variable is on the stack - no memory is allocated each time, only
the value is assigned, which (to get the same semantics) it would be
anyway.


Yikes no! That makes it far from obvious which variable is used where.
It's also a pain when actually writing code.

I personally prefer "keep the scope as tight as possible, and where you
can, only declare at the point of first use".
Out of curiosity, do you declare anywhere in the body at use or do you mean
at the beginning of hte scope level where the variable is contained?
My own behaviour on this is pretty variant. I put my commonly used variables
at the head of the scope level(at the start of hte if or loop or using or
whatnot when applicable) most of hte time, while I'll often inline declare a
variable that does nothing but hold a return value for an if or as a
parameter to another method.
 
Marina said:
I find code littered with variable declarations in the middle of functions
very difficult to read, it breaks the flow of trying to figure out what's
going on. To me, it just makes the methods look messy and unorganized.

I suspect that may just be force of habit - are you from a C
background, or some other language where that was forced upon you?
I am confused as to your explanation, I would expect the variable to come
off the stack at the bottom of the loop, and then be pushed back on at the
next iteration of the loop. But if you say so...

The stack gets created with the number of slots needed for the whole
method. Its size doesn't get changed during the method. If you look at
the IL generated for the two different places of declaration, you'll
find it's exactly the same.
 
Out of curiosity, do you declare anywhere in the body at use or do you mean
at the beginning of hte scope level where the variable is contained?

Anywhere in the body - or rather, as late as I possibly can. That
usually means at the point where the value is first assigned, unless
I've got something like:

int index;
if (someCondition)
{
// Work out index in one way
}
else
{
// Work out index in another way
}
My own behaviour on this is pretty variant. I put my commonly used variables
at the head of the scope level(at the start of hte if or loop or using or
whatnot when applicable) most of hte time, while I'll often inline declare a
variable that does nothing but hold a return value for an if or as a
parameter to another method.

Right - not the way I do things, but certainly better (IMO) than
putting *all* the variables at the start of the method.
 
Nope, not from a C background at all. I find it just much cleaner to have
all the declarations in one place, and then have the code be purely
functional - as opposed to littered with variable declarations all over the
place.
 
I disagree here. This school of thought is Delphi driven which in turn came
from the pascal world. The horrible abuse of this method is well documented
and I won't go into it here. I'll just say that it is more efficient to
declare variables as close to their use as possible except when conditions
force other behavior. Declaring variables at the top of a routine is
wasteful of resources and pushes uncomfortably close into *global variable*
approaches.

--
Regards,
Alvin Bruney [ASP.NET MVP]
Got tidbits? Get it here...
http://tinyurl.com/27cok
Marina said:
Nope, not from a C background at all. I find it just much cleaner to have
all the declarations in one place, and then have the code be purely
functional - as opposed to littered with variable declarations all over the
place.
 
I disagree here. This school of thought is Delphi driven which in turn came
from the pascal world. The horrible abuse of this method is well documented
and I won't go into it here. I'll just say that it is more efficient to
declare variables as close to their use as possible except when conditions
force other behavior. Declaring variables at the top of a routine is
wasteful of resources and pushes uncomfortably close into *global variable*
approaches.

I don't believe it's actually wasteful of resources at all - could you
give an example of what you mean?

I certainly believe it makes the code harder to understand (and write
in the first place), but I don't believe it has any impact on the
actual resource usage.
 
Consider a variable, a simple type for now, declared up top like so
{
variable here;
//processing done here
//if block with a return statement
//variable used here
}

In the above scenario, the space allocated for variable here is wasted if
the if block executes. It has to be discarded without being used. It would
have been more economical to declare variable here after the if block. For a
simple type allocated on the stack, who cares? This is the trivial case. For
a database connection declared on the heap in a web environment, this is an
entirely different story. As an example, SQLConnection cn = new
SQLConnection(connstring) substituted for variable here actually flags a
connection from the pool as occupied even though it may not necessarily be.
This sort of pattern may lead to resource contention on a website
experiencing heavy load - amazon.com for example. In fact, if the connection
were to be opened right after definition which arguably would most likely be
the case, this could quite certainly lead to a memory leak in the event that
either the if block executes and returns without closing the connection or
an exception is thrown at some point after opening the connection or the
//processing done here block of code takes a relatively long time to
complete. The resource wastefullness of this approach is compounded for more
complex types and custom objects which are very expensive to create such as
in custom types which require object pooling.
 
Consider a variable, a simple type for now, declared up top like so
{
variable here;
//processing done here
//if block with a return statement
//variable used here
}

In the above scenario, the space allocated for variable here is wasted if
the if block executes.

The space for *all* local variables is allocated at the start of the
method, whether or not the variable ends up being used.
It has to be discarded without being used. It would
have been more economical to declare variable here after the if block.

No it wouldn't.
For a simple type allocated on the stack, who cares? This is the trivial case. For
a database connection declared on the heap in a web environment, this is an
entirely different story. As an example, SQLConnection cn = new
SQLConnection(connstring) substituted for variable here actually flags a
connection from the pool as occupied even though it may not necessarily be.

Ah - but that's a different matter. Where the variables are *declared*
isn't relevant - when they're assigned values which require allocation,
that's a different matter. I don't believe anyone has been suggesting
premature allocation of resources, just variable declaration.
 
The space for *all* local variables is allocated at the start of the
method, whether or not the variable ends up being used.
I specifically didn't qualify the variable as local. The argument is without
merit because local variables go away with the stack upon scope exit. My
argument is still valid because the framework provides for declaration and
allocation of heap objects inside a calling routine. The dangerous case is
non-local variables holding references to global objects. My suggested
pattern avoids this issue because definition and declaration are at the
point of use which occurs in the narrowest possible scope.
I don't believe anyone has been suggesting
premature allocation of resources, just variable declaration.
Strictly speaking, variable declaration is not a problem which is the
trivial cases. However, the framework does provide for declaration IS
definition syntax. A guarantee cannot be made that a declaration is and
always will be just a declaration. In fact, more often than not, declaration
is definition syntax is common and widespread. So just the suggestion of
variable declaration is suspect.

I'm going out on a limb here, but I believe that declaration is definition
for simple intrinsic types. The compiler already knows the size and type of
the variable and promptly allocates space for it even though it is a
declaration. I'm aware that they aren't initialized, does that necessarily
imply that space was not allocated? This is the case for C++ unmanaged. I am
not 100% sure whether it applies to .NET and managed C++ but I can't see why
not as it would be an excellent optimization strategy. If I am wrong, I
stand corrected.

You will find that as long as there are cases of declarations being
definitions, this pattern should rightly be avoided. I extend this argument
to the non-qualified trivial case as well. It should be avoided period.
There is absolutely no value - that I know of - to this pattern. It's a
relic from an old era with the potential to introduce misbehaving code.
 
I specifically didn't qualify the variable as local.

As far as I'm aware, local ones are the only ones being discussed up to
this point in this thread. Certainly I'd been only talking about local
variables.
The argument is without
merit because local variables go away with the stack upon scope exit. My
argument is still valid because the framework provides for declaration and
allocation of heap objects inside a calling routine. The dangerous case is
non-local variables holding references to global objects. My suggested
pattern avoids this issue because definition and declaration are at the
point of use which occurs in the narrowest possible scope.

Certainly variables which are logically local shouldn't be declared as
instance or static variables - but I don't believe anyone had suggested
doing that.
Strictly speaking, variable declaration is not a problem which is the
trivial cases. However, the framework does provide for declaration IS
definition syntax.

Yes, but I don't believe anyone was actually suggesting assigning
values before the appropriate time though.
A guarantee cannot be made that a declaration is and
always will be just a declaration.

No, but you seem to be criticising a style which puts declarations at
the top of the method *on the assumption* that assignment also happens
early.
In fact, more often than not, declaration
is definition syntax is common and widespread. So just the suggestion of
variable declaration is suspect.

Not really - that's all that was being suggested.
I'm going out on a limb here, but I believe that declaration is definition
for simple intrinsic types.
Nope.

The compiler already knows the size and type of
the variable and promptly allocates space for it even though it is a
declaration.

The space is allocated for *all* local variables at the start of a
method, wherever they're declared.
I'm aware that they aren't initialized, does that necessarily
imply that space was not allocated?

No - see the above.
This is the case for C++ unmanaged.

I'm not sure that's necessarily the case, actually, but I'm not
sufficiently sure to take issue with you.
You will find that as long as there are cases of declarations being
definitions, this pattern should rightly be avoided. I extend this argument
to the non-qualified trivial case as well. It should be avoided period.
There is absolutely no value - that I know of - to this pattern. It's a
relic from an old era with the potential to introduce misbehaving code.

There are certainly reasons (IMO) to avoid it for *readability*
reasons, but you specifically gave a *performance* reason, which
doesn't apply if the variables are only *declared* early. There was no
indication in Marina's of early assignment, so her style as described
so far cannot be criticised on performance reasons.
 
I may have read too much into Marina's post then.

Marina, were you only talking about declaration or was there even a hint in
there - even remotely - of definition at the top? (serious grasping going
on)
 
Back
Top