In comp.os.linux.advocacy, Jeff_Relf <[email protected]> wrote:
Hi Ernestine, Ya wrote: << I wish I knew what ya all were talkin' 'bout,
Precedence ain't got not'n to do with what beer I drink. >>
When coding, it helps to know the order of execution;
Spooky apparently has no clue about that.
I have more of a clue than you might think.
The HP PA Risc architecture, for instance, had some very
interesting ideas about exactly when to execute an
instruction. (I had the privilege of using a PA Risc Unix
workstation running HP/UX for some years in my prior job --
nice machine, actually.)
Basically, the processor had the option (and usually took
it) of executing the instruction following a conditional
branch, even though the branch was taken.
This presumably gave compiler optimizers fits.
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,958!33!250,00.html
<excerpt>
PA-RISC 1.1 Architecture and Instruction Set Reference Manual
[...]
concept of delayed branching
All branch instructions exhibit the delayed branch
feature. This implies that the major effect of the branch
instruction, the actual transfer of control, occurs one
instruction after the execution of the branch. As a result,
the instruction following the branch (located in the delay
slot of the branch instruction) is executed before control
passes to the branch destination. The concept of delayed
branching is illustrated in Figure 42.
Execution of the delay slot instruction, however, may be
skipped ("nullified") by setting the "nullify" bit in the
branch instruction to 1.
delayed branching
program segment
Location Instruction Comment
100 STW r3, 0(r6) ; non-branch instruction
104 BLR r8, r0 ; branch to location 200
108 ADD r7, r2, r3 ; instruction in delay slot
10C OR r6, r5, r9 ; next instruction in linear code
; sequence
...
200 LDW 0(r3), r4 ; target of branch instruction
execution sequence
Location Instruction Comment
100 STW r3, 0(r6) ;
104 BLR r8, r0 ;
108 ADD r7, r2, r3 ; delay slot
; instruction is executed before
200 LDW 0(r3), r4 ; execution of
; target instruction
</excerpt>
Also, the compiler, when presented with your example (paraphrased):
int swap32(int v)
{
    unsigned char * p = (unsigned char *) &v;
    p += 4;
    return (*--p << 24)|(*--p << 16)|(*--p << 8)|(*--p);
}
has the option of evaluating the operands in whatever
order it likes, at the whim of the compiler designer(s) --
in fact, since p is modified four times with no intervening
sequence point, the C standard makes the behavior undefined
outright, not merely order-dependent. (Unless Kelsey tells
me different -- he's more familiar with the C/C++
specifications than I am.)
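One way to get the same result with no ordering ambiguity at all is to shift the value itself rather than march a pointer through it; a minimal sketch (the mask-and-shift approach here is mine, not from the original example):

```c
/* Byte-swap with a fully defined result: each operand reads
   v exactly once, so no sequence-point rules are violated,
   and the result doesn't depend on host byte order either. */
unsigned int swap32(unsigned int v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

Note the unsigned type: shifting a negative int left into the sign bit is itself undefined, which the original int version quietly risks.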
However, some of the best optimization is done within
the compiler by reading the code *backwards* -- e.g., the
compiler might forbear a final store for a local variable
to save time, as the variable will just be discarded
later. In fact, the better compilers don't bother with
storing local variables at all, if they don't need to and
sufficient registers are available. There are a number of
other possibilities, some of which I've already alluded to.
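A concrete (and hypothetical -- the function and variable names are mine) illustration of the sort of local an optimizer can keep out of memory entirely:

```c
/* 'tmp' never needs a stack slot: a decent optimizer keeps
   it in a register, and any "final store" of tmp is dead
   code, since the value is returned and then discarded. */
int scale_and_sum(int x, int y)
{
    int tmp = x * 2;   /* candidate for a register */
    tmp += y;          /* last write to tmp... */
    return tmp;        /* ...is only ever consumed here */
}
```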
I'm also familiar with some of the problems facing optimizers.
Briefly put, the code
int * selectAPointer(int * ap, int * bp, int * cp);

int doSomething()
{
    int a = 1, b = 2, c = 3, *p;
    int d1 = a + b + c;
    p = selectAPointer(&a, &b, &c);
    (*p)++;   /* note the parentheses: *p++ would bump the pointer, not the value */
    int d2 = a + b + c;
    return d1 + d2;
}
would have to be very carefully handled, as p might very
well point to a, b, or c after selectAPointer() returns,
or it might point somewhere completely different -- but
there's no way for the compiler to tell here. Therefore,
both d1 and d2 will probably have to be explicitly coded as
something along the lines of:
MOV a(SP), r1
ADD b(SP), r1
ADD c(SP), r1
MOV r1, d1(SP)
Without the selectAPointer() call, the compiler has
the option of *merging* d1 and d2, conceptually (using
a method called "common subexpression elimination"), and
potentially making the code more efficient -- if only
by 4 instructions and a now-redundant stackframe location.
It might also remove the last instruction, and keep
d1 in a register for the length of the code.
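To make the aliasing effect concrete, here is a self-contained sketch; the body of selectAPointer() is my invention (in the scenario above the compiler can't see it, which is the whole point):

```c
/* Hypothetical stand-in for an opaque function: as far as a
   caller's optimizer knows, the returned pointer may alias
   any of the three arguments. */
static int *selectAPointer(int *ap, int *bp, int *cp)
{
    (void)ap; (void)cp;
    return bp;              /* arbitrary choice for the demo */
}

int doSomething(void)
{
    int a = 1, b = 2, c = 3;
    int d1 = a + b + c;     /* first sum: 6 */
    int *p = selectAPointer(&a, &b, &c);
    (*p)++;                 /* (*p)++ bumps the value; *p++ would bump the pointer */
    int d2 = a + b + c;     /* must be recomputed: b changed under the alias */
    return d2 - d1;         /* nonzero: proof the two sums can't be merged */
}
```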
I bring all this up to illustrate that your pattern of
linear thinking, generally desirable though it is, has
some drawbacks.
Be clear in your coding now, and your coding will be
clear to you later, when you reread it.
It will
also be clear to the compiler, when *it* reads it.
Of course, knowing the way you code, the code may very
well be clear to you already; just don't expect it to
make sense to the rest of us, without careful analysis.
The ideal, at least to a manager's thinking, is to shuffle
people around at need between projects, both to keep them
interested and to maximize utilization of resources --
this may require reading and modifying someone else's code.
There is at least one hacker [according to Steven Levy,
anyway] out there who writes extremely ingenious -- most
would say incomprehensible -- code with nary a comment.
I would subscribe to that notion myself, except that
most machine code is pretty much as one might see from
the foregoing: without the comments, one might as well be
looking at gobbledygook, although the machine "understands"
it perfectly well as long as it knows which registers to
modify and which memory locations to wiggle. PA Risc is
slightly unusual in its "delay slot", but is otherwise
more or less typical of such machines, though it has
more registers (32 at the user level; I can't say I played
with the space registers all that much) than one's typical
Intel hardware, and is probably better designed from an
instruction standpoint, since it didn't have to worry about
8080 source code compatibility. And even then, the above
is *assembly*; the actual machine code would be little more
than a bunch of hex numbers or charge packets.
It was once common for system programmers to pore over
pages of printouts dumped from a console in old IBM
hardware, for example (fortunately, that was before
my time). Of course, that was also when an IBM machine
had all of maybe 8K or 16K of RAM.
I've also caught MS VC++ (version 5.0) in a bona fide
memory corruption condition. I'll admit I'm not sure who
created the bug (as it was running on NT3.5 or NT4, whose
memory allocation was slightly suspect) but the assembly
listing from a section of code I was working on at the
time was quite corrupted for the space of about 10 lines.
As Ronald Reagan once said, "trust but verify".
However, compiler mangling is not the first thing that
comes to mind when one's code does the goofy --
most likely, it's one's code.
Unable to recognize a simple coder, he also has no clue about who I am.
There are more things in Heaven and Earth, Horatio, than are dreamt of
in your philosophy.
-- Hamlet, Act I, Scene 5