fast access to members of structure

  • Thread starter: A n g l e r

A n g l e r

Hello everybody.

I've got a nagging question about fast, well-optimized access to
members of a structure (or class) and how the VC compiler handles it.
Consider the following structure and an array of its type:

struct SSth1
{
    int nMbr1, nMbr2, nMbr3;
} aSth1[1000];

Now, when I access members within a loop, I can do the following:

for (int i=0; i<1000; i++)
{
    for (int j=0; j<1000; j++)
    {
        doSth(j, &aSth1[i].nMbr1);
        doSth(j, &aSth1[i].nMbr2);
        doSth(j, &aSth1[i].nMbr3);
    }
}

doSth is just a function that modifies the variables passed to it.
Alternatively, I can do the following:

for (int i=0; i<1000; i++)
{
    int* nMbrLoc1=&aSth1[i].nMbr1;
    int* nMbrLoc2=&aSth1[i].nMbr2;
    int* nMbrLoc3=&aSth1[i].nMbr3;

    for (int j=0; j<1000; j++)
    {
        doSth(j, nMbrLoc1);
        doSth(j, nMbrLoc2);
        doSth(j, nMbrLoc3);
    }
}

I reckon the second piece of code will run faster, since the pointers
to aSth1[i].nMbr1 and the other members are evaluated 1000 times rather
than 1000*1000 times. Is that right? That is what I assume, but I do
not know how the VC 2005 compiler actually handles optimizations like
this. Can anyone elaborate a bit on that? Do I have to code it the
second way, or can the compiler be relied on to hoist the pointer
evaluation by itself?


Cheers,
Peter.
 
I reckon the second piece of code will run faster, since the pointers
to aSth1[i].nMbr1 and the other members are evaluated 1000 times rather
than 1000*1000 times. Is that right? That is what I assume, but I do
not know how the VC 2005 compiler actually handles optimizations like
this. Can anyone elaborate a bit on that? Do I have to code it the
second way, or can the compiler be relied on to hoist the pointer
evaluation by itself?


There are 2 correct answers:
1) check the asm
2) test

Really, you can discuss this until your hair turns grey, but the only
sensible thing is to measure it, using real-life scenarios.
It is easy enough to contrive examples where some source code optimization
makes a difference, but real-life performance is what matters.

The VC optimizer is pretty good, and when you use profile guided
optimization, it will do things that are beyond any source level optimization.

Without correct measurements, it is impossible to draw any conclusions.
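
To make "test" concrete, here is a minimal timing sketch of the two loop
forms from the question. Several things in it are assumptions and not
from the original post: the body of doSth (the thread never shows it),
the helper names resetArray, checksum, checkVariantsAgree and
millisecondsFor, and the use of std::chrono, which needs a C++11
compiler and so is newer than VC 2005. With a doSth this trivial, the
optimizer may well reduce both forms to the same code, which would
itself be an answer.

```cpp
#include <cassert>
#include <chrono>

struct SSth1
{
    int nMbr1, nMbr2, nMbr3;
};
static SSth1 aSth1[1000];

// Hypothetical stand-in for doSth; the real one is not shown in the thread.
void doSth(int j, int* p) { *p += j; }

// First form: take the member addresses inside the inner loop.
void variantDirect()
{
    for (int i = 0; i < 1000; i++)
        for (int j = 0; j < 1000; j++)
        {
            doSth(j, &aSth1[i].nMbr1);
            doSth(j, &aSth1[i].nMbr2);
            doSth(j, &aSth1[i].nMbr3);
        }
}

// Second form: hoist the three addresses out of the inner loop by hand.
void variantHoisted()
{
    for (int i = 0; i < 1000; i++)
    {
        int* nMbrLoc1 = &aSth1[i].nMbr1;
        int* nMbrLoc2 = &aSth1[i].nMbr2;
        int* nMbrLoc3 = &aSth1[i].nMbr3;
        for (int j = 0; j < 1000; j++)
        {
            doSth(j, nMbrLoc1);
            doSth(j, nMbrLoc2);
            doSth(j, nMbrLoc3);
        }
    }
}

// Zero every member so each variant starts from the same state.
void resetArray()
{
    for (int i = 0; i < 1000; i++)
        aSth1[i].nMbr1 = aSth1[i].nMbr2 = aSth1[i].nMbr3 = 0;
}

// Sum of all members; using the result keeps the work observable.
long long checksum()
{
    long long sum = 0;
    for (int i = 0; i < 1000; i++)
        sum += aSth1[i].nMbr1 + aSth1[i].nMbr2 + aSth1[i].nMbr3;
    return sum;
}

// Sanity check: both variants must leave the array in the same state.
void checkVariantsAgree()
{
    resetArray();
    variantDirect();
    const long long direct = checksum();
    resetArray();
    variantHoisted();
    const long long hoisted = checksum();
    assert(direct == hoisted);
}

// Wall-clock time of one call to f, in milliseconds.
template <typename F>
double millisecondsFor(F f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Call checkVariantsAgree() once, then millisecondsFor(variantDirect) and
millisecondsFor(variantHoisted) several times each in an optimized
build. A debug build tells you little, and a single run is too noisy to
trust.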
 
Further to Bruno's comments, does it matter which is faster? Generally,
you should program for efficiency by using well designed algorithms, and
then write the code for these algorithms in the clearest way possible
(obviously sticking to efficient idioms where it doesn't impact
clarity). Only if you find you have bottlenecks for particular bits of
code should you consider 'micro-optimization'. In this particular case,
I imagine that the compiler optimizes it fine, but the only way to be
sure is to compile it with the optimizations you intend to use, and
check the execution time and asm output. But the point is that it would
usually be a wasted exercise, unless that bit of code is a bottleneck.
In the code above, a more likely bottleneck is the implementation of doSth.
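
As a rough illustration of how little work "evaluating the pointer"
is, the sketch below spells out by hand what &aSth1[i].nMbr2 amounts
to: the array base, plus i times the structure size, plus a constant
member offset. The helper names addressByHand, sameAddress and checkAll
are made up for this example, and it says nothing about what code VC
2005 actually emits; for that you still need the asm output.

```cpp
#include <cassert>
#include <cstddef>

struct SSth1
{
    int nMbr1, nMbr2, nMbr3;
};
static SSth1 aSth1[1000];

// Hypothetical helper: compute the address of aSth1[i].nMbr2 by hand
// as base + i * sizeof(SSth1) + offsetof(SSth1, nMbr2).
int* addressByHand(std::size_t i)
{
    char* base = reinterpret_cast<char*>(aSth1);
    return reinterpret_cast<int*>(
        base + i * sizeof(SSth1) + offsetof(SSth1, nMbr2));
}

// True if the hand-computed address matches the ordinary expression.
bool sameAddress(std::size_t i)
{
    return addressByHand(i) == &aSth1[i].nMbr2;
}

// Check every element of the array.
void checkAll()
{
    for (std::size_t i = 0; i < 1000; i++)
        assert(sameAddress(i));
}
```

So the per-iteration cost being debated is one multiply-and-add at
most, which is small next to a call to doSth unless that call is
inlined.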

Tom
 