Bad enough to be a bug

PLS

I'm amazed at the following generated code.

The environment is:
VS2005 SP1
Debug mode

Declarations
bool different;
fileVersion and MSGFILEVERSION are unsigned int

different = different || header.fileVersion != MSGFILEVERSION;
00672244 movzx eax,byte ptr [different]
0067224B test eax,eax
0067224D jne 00672264
0067224F cmp dword ptr [header],67h
00672256 jne 00672264
00672258 mov dword ptr [ebp-89Ch],0
00672262 jmp 0067226E
00672264 mov dword ptr [ebp-89Ch],1
0067226E mov ecx,dword ptr [ebp-89Ch]
00672274 call @ILT+11940(@_RTC_Check_4_to_1@4) (560EA9h)
00672279 mov byte ptr [different],al

The generated code develops a one-byte bool value by storing it into a
32-bit int temporary. The value can only be 0 or 1. It then goes through a
truncation check before being stored into a one-byte boolean variable.
Awful.
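For readers who don't want to trace the disassembly, here is a rough C++ sketch of the control flow it implements. It is only a model, not what the compiler emits: the Header struct and the update_different function are hypothetical names, and the value 0x67 for MSGFILEVERSION is inferred from the "cmp ...,67h" instruction in the listing.

```cpp
#include <cstdint>

// Hypothetical stand-in for the header object in the post; only the
// fileVersion field is taken from the thread.
struct Header {
    unsigned int fileVersion;
};

// Assumption: 0x67 is read off the "cmp dword ptr [header],67h" line.
const unsigned int MSGFILEVERSION = 0x67;

// Mirrors the branch structure of the listing: the result is built in a
// 32-bit temporary and only then narrowed to a one-byte bool.
bool update_different(bool different, const Header& header) {
    std::int32_t temp;  // plays the role of the slot at [ebp-89Ch]
    if (different) {
        temp = 1;       // first jne: 'different' was already true
    } else if (header.fileVersion != MSGFILEVERSION) {
        temp = 1;       // second jne: the versions differ
    } else {
        temp = 0;       // fall-through: nothing changed
    }
    // In the debug listing, this narrowing is where the call to
    // _RTC_Check_4_to_1 sits before the byte is stored.
    return temp != 0;
}
```

Written this way, it is easy to see that the temporary can only ever hold 0 or 1 by the time it is narrowed.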

++PLS
 
I'm amazed at the following generated code.

Debug or optimized? If it's debug, the codegen is usually "awful,"
but that's in order to make it easier to debug it. What optimization
settings are you at?

Nathan Mates
 
Nathan said:
Debug or optimized? If it's debug, the codegen is usually "awful,"
but that's in order to make it easier to debug it. What optimization
settings are you at?

The OP indicated that this was a debug build. For a debug build, the code
is not optimized at all but is, as you point out, intended to be easy for
the debugger to understand. Another consideration is the penalty that you
pay on modern CPUs for doing byte-sized operations - it can be 4x (or more)
slower to do a byte operation than an entire 32-bit word, so the code isn't
as awful as the OP suggests.

-cd
 
Carl Daniel said:
For a debug build, the code is not optimized at all but is, as you point
out, intended to be easy for the debugger to understand. Another
consideration is the penalty that you pay on modern CPUs for doing
byte-sized operations - it can be 4x (or more) slower to do a byte
operation than an entire 32-bit word, so the code isn't as awful as the OP
suggests.

And what's the penalty for the part that the OP said he/she was amazed by?
00672274 call @ILT+11940(@_RTC_Check_4_to_1@4) (560EA9h)

Even in a debug build, the compiler knew that the possible values 0 or 1
don't need a subroutine call to figure out if they fit in a byte.
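To make that concrete, here is a minimal sketch of what a 4-byte-to-1-byte truncation check has to decide. The function fits_in_one_byte is a hypothetical model, not the CRT's _RTC_Check_4_to_1, and the rule it uses (upper 24 bits all zero or all one) is an assumption about what "no data lost" means here.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit to 8-bit truncation check. It only
// answers the question "does this value survive being stored in a byte?".
bool fits_in_one_byte(std::uint32_t value) {
    const std::uint32_t upper = value & 0xFFFFFF00u;
    // Accept values whose upper 24 bits are all zero (0..255) or all one
    // (small negative numbers after sign extension).
    return upper == 0u || upper == 0xFFFFFF00u;
}
```

Under this model both 0 and 1 trivially pass, so for a temporary that can hold nothing else the check can never fire.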
 
Norman said:
"Carl Daniel [VC++ MVP]"
For a debug build, the code is not optimized at all but is, as you
point out, intended to be easy for the debugger to understand. Another
consideration is the penalty that you pay on modern CPUs for
doing byte-sized operations - it can be 4x (or more) slower to do a
byte operation than an entire 32-bit word, so the code isn't as
awful as the OP suggests.

And what's the penalty for the part that the OP said he/she was
amazed by?
00672274 call @ILT+11940(@_RTC_Check_4_to_1@4) (560EA9h)

Even in a debug build, the compiler knew that the possible values 0
or 1 don't need a subroutine call to figure out if they fit in a byte.

Compile time. A tiny amount. Really tiny. Granted, it does seem like
something the compiler could have easily omitted even in a debug build.

-cd
 