weird VC.NET 2003 compiler behaviour (for C++ windows DLL based project)


Daniel Yelland

Hi,

I have developed a number of code libraries in Win32 DLLs and have written a
number of test suite executables that implicitly link to these libraries in
order to test them. In one of my test applications, which runs fine in Debug
mode, it is crashing in the destructor of a local object on the stack when
it is built in release mode.

An example of the C++ that causes the problem is as follows (apologies for
the contrived example): -

class CTest : public CSmartPointer
{
public:

CTest() {;}
~CTest() {;}

// Overridden to throw, so the test can detect the release
void ReleaseObject() {throw _T("Released Object!");}
};

CTest kSmartPointer;

unsigned int uiIndex = 0;

for(uiIndex = 0; uiIndex < MAX_REFERENCES; uiIndex++)
{
kSmartPointer.IncrementReferenceCount();
}

try
{
for(uiIndex = 0; uiIndex < MAX_REFERENCES; uiIndex++)
{
kSmartPointer.DecrementReferenceCount();
}
}
catch(const TCHAR *)
{
bTestPassed = true;
}

CSmartPointer is a base class defined in one of the DLL libraries. The
exception is thrown as expected and the test is marked as passed. Once the
function exits, though, the program crashes on entering the destructor for
the CSmartPointer base class.

Now for the weirdness...

In attempting to track this problem down, I decided to pad the function with
_asm nop instructions to spot potential stack corruption. This fixed the
problem, but when I looked at the generated code I got a surprise...

Generated code with one trailing _asm nop at the end of the function: -

lea ecx,[kSmartPointer]
mov dword ptr [ebp-4],0FFFFFFFFh
mov dword ptr [kSmartPointer],offset `CSmartPointerTestElement::RunSmartPointerTests'::`2'::CTest::`vftable' (4032C0h)
call CSmartPointer::~CSmartPointer (402304h)

Contents of ECX = 0012FEC0
Address of object = 0012FEC0

Generated code with no _asm nop instructions: -

mov dword ptr [ebp-4],0FFFFFFFFh
mov dword ptr [kSmartPointer],offset `CSmartPointerTestElement::RunSmartPointerTests'::`2'::CTest::`vftable' (4032C0h)
call CSmartPointer::~CSmartPointer (4022F4h)

Contents of ECX = 7C359270
Address of object = 0012FEC0

This problem does not occur when global optimisations are turned off for the
function (#pragma optimize("g", off)), so I assumed the compiler was making
assumptions about the contents of the ECX register not changing even in the
presence of exceptions (ECX is loaded with the address of the smart pointer
earlier in the file). What I don't understand, however, is why adding an
_asm nop would cause the generation of another load instruction.
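
For anyone wanting to try the same per-function workaround, here is a minimal sketch of how the pragma is usually scoped around a single function. SumFirstN is a hypothetical stand-in for the affected function, and the _MSC_VER guard is my own addition so that other compilers simply skip the pragma.

```cpp
// Hypothetical stand-in for a function miscompiled under global optimisation.
#ifdef _MSC_VER
#pragma optimize("g", off)   // turn global optimisations off from this point
#endif

int SumFirstN(int n)
{
    int iTotal = 0;

    // Plain loop; only its placement between the two pragmas matters here.
    for (int i = 1; i <= n; i++)
    {
        iTotal += i;
    }

    return iTotal;
}

#ifdef _MSC_VER
#pragma optimize("", on)     // restore the optimisations set on the command line
#endif
```

The pragma must appear outside the function body and applies to every function defined after it, which is why the matching "on" line follows immediately.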

Is there anything I could be doing wrong in either my project settings or
C++/DLL declarations that could cause this behaviour?

If anyone has any information into the cause or resolution of this bug, it
would be greatly appreciated.

Regards,

Daniel Yelland
 
Daniel,

Before going into the depths, let's check the basics...

Does the problem not occur in your release build if you don't use the
global optimisation option?

If it occurs regardless of that setting, is it likely to be that
you're allocating the object in another DLL which is using a different
run-time heap?

Cross-module allocation & deletion only works when all modules are
built to use the same common run-time heap. This means they all have
to use the DLL version of the 'C' run-time, and you can't mix debug
and release components.

The alternative solutions are to ensure both modules use a common
memory allocation scheme (such as GlobalAlloc), or not to have the
situation arise - always ensure that whoever allocates the object ends
up freeing it.
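
As a rough sketch of the "whoever allocates also frees" rule, assuming a modern C++11 compiler (std::unique_ptr postdates the compilers discussed here): Widget, CreateWidget, DestroyWidget and MakeWidget are all hypothetical names, and in a real project the create/destroy pair would be exported from the DLL that owns the heap.

```cpp
#include <memory>

// Hypothetical library type owned by the DLL.
struct Widget
{
    int value;
};

// Both functions would live in (and be exported from) the same module,
// so allocation and deallocation always use that module's heap.
Widget* CreateWidget(int value) { return new Widget{value}; }
void DestroyWidget(Widget* pWidget) { delete pWidget; }

// Deleter that routes destruction back to the owning module.
struct WidgetDeleter
{
    void operator()(Widget* pWidget) const { DestroyWidget(pWidget); }
};

typedef std::unique_ptr<Widget, WidgetDeleter> WidgetPtr;

// Client-side helper: the caller never calls delete directly.
WidgetPtr MakeWidget(int value)
{
    return WidgetPtr(CreateWidget(value));
}
```

With this arrangement the client holds a WidgetPtr and lets it go out of scope; the actual delete always runs inside the module that did the new.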

Dave
 
The problem does not occur if global optimisations are turned off for the function. The object is allocated on the stack of the test executable. There is no dynamic allocation of memory in the example. The executable and all of the DLLs in the project use the same version of the run-time (Multithreaded DLL for release, Multithreaded Debug DLL for debug builds).

I'm pretty sure it's not a memory allocation problem. The function only crashes on exit if an exception is thrown by the object. By the time the exception is caught in the catch block, the contents of the ECX register have changed and are incorrect for the subsequent destructor call on the local object.

Here's a quick code example of what I mean: -

// In the DLL
class EXPORT_DECL CBase
{
protected:

CBase() {;}
~CBase() {;}

virtual void Function() {;}
};

// In the test application
class CObject : public CBase
{
public:

CObject() {;}
~CObject() {;}

void Function() { throw _T("Exception!"); }
};

void Test(void)
{
CObject kObject;

try
{
kObject.Function();
}
catch(...)
{
bExceptionThrown = true;
}
}

The code crashes in the destructor of the CObject class, before the function returns to the caller. One thing I may not have mentioned is that CObject derives from a base class (CBase) defined in one of the DLLs, Function() is virtual, and has been overridden in the CObject class to throw the exception. It is this exception throwing that triggers the crash.

Any further ideas greatly appreciated,

Daniel.
 
The problem does not occur if global optimisations are turned off for the function.

OK, then it looks like a global optimisation issue.

Do you have a means of reproducing it with a stand-alone example -
something anyone can easily build and repro?

Dave
 
I'll try to create a simple example that illustrates the bug. I'll post some
example source when it's done (probably a couple of days when I get some
spare time to investigate it further).

Thanks,

Dan.
 
Here is some really simple test code that exhibits the weird behaviour (this
was built as a Win32 console app): -

TestBase.h

class CTestBase
{
public:

virtual void VirtualFunctionCaller() {VirtualFunction();}

protected:

CTestBase() {;}
virtual ~CTestBase();


virtual void VirtualFunction() const{;}
};

TestBase.cpp

#include "TestBase.h"

CTestBase::~CTestBase()
{
// Empty
}

Test.cpp

#include "TestBase.h"

class CTestDerived : public CTestBase
{
public:

CTestDerived() {;}
~CTestDerived() {;}

void VirtualFunction() const { throw true; }
};

bool TestFunc()
{
bool bTestPassed = true;

CTestDerived kTest;

unsigned int uiReferenceCount = 5;

try
{
for(unsigned int uiIndex = 0; uiIndex < 5; uiIndex++)
{
uiReferenceCount--;

kTest.VirtualFunctionCaller();
}
}
catch(bool)
{
bTestPassed &= (uiReferenceCount==0);
}

return bTestPassed;
}

int main(int, char *[])
{
TestFunc();

return 0;
}

Now for some more information...

If the base class destructor is inlined, then the code works fine.
If I remove the loop the code works fine. As part of the loop condition the
compiler loads the ECX register with the address of the local object. When
an exception is thrown, and the ECX register has changed, there is no
instruction to reload the contents of the register and the function crashes
when it calls the destructor.

If the loop is removed, the ECX register is never initialised to the address
of the local object prior to the exception, and a load instruction is
generated before the destructor is called and everything runs fine.
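
For reference, here is a single-file sketch of the variant with the base class destructor inlined, which is the form reported above as working. It assumes nothing beyond the test code already posted; RemainingAfterThrow is a hypothetical name, and it returns the counter instead of a pass flag so the result is easy to check.

```cpp
class CTestBase
{
public:

    virtual void VirtualFunctionCaller() { VirtualFunction(); }

protected:

    CTestBase() {}

    // Defined inline here instead of out of line in TestBase.cpp.
    virtual ~CTestBase() {}

    virtual void VirtualFunction() const {}
};

class CTestDerived : public CTestBase
{
public:

    // Always throws, as in the original test code.
    void VirtualFunction() const { throw true; }
};

// Returns the value of the counter after the exception has been handled.
unsigned int RemainingAfterThrow()
{
    CTestDerived kTest;

    unsigned int uiReferenceCount = 5;

    try
    {
        for (unsigned int uiIndex = 0; uiIndex < 5; uiIndex++)
        {
            uiReferenceCount--;

            kTest.VirtualFunctionCaller();   // throws on the first iteration
        }
    }
    catch (bool)
    {
        // Nothing to do; the destructor of kTest runs when the function exits.
    }

    return uiReferenceCount;
}
```

Since the throw happens on the first pass through the loop, a correct build returns 4 from RemainingAfterThrow and then destroys kTest normally.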

Neither the type of exception thrown nor the code inside the exception
handler makes any difference to the bug.

So at least now I know it is nothing specifically to do with DLLs :)
 
Here is some really simple test code that exhibits the weird behaviour

Daniel,

I can reproduce the crash when I use /Og with VS2003 (VC7.1).

However, the problem doesn't exist with the VS2005 beta 1 compiler, so it
looks like MS must have fixed it.

You should note that the newer compiler reports:

cl : Command line warning D9035 : option 'Og' has been deprecated and
will be removed in a future release

Despite that warning, /Og still has an effect: I have compared the
assembly code generated with and without it, and the two are quite
different. That's why I conclude that MS must have fixed the issue.

Dave
 
Thanks for investigating this, David. I compiled the example code using the
Visual C++ 6 and Intel 7.1 compilers and had no problems with those either.

It's good to know the problem will be fixed in a future release of the
compiler.

Thanks again,

Dan.
 
If you need definite confirmation, you should contact MS PSS (phone)
and report the problem. You shouldn't be charged for reporting a bug.

Dave
 