Adding non-virtual, non-inline method to a class requires all dependent dlls to be recompiled

Nick Bishop · Mar 8, 2004

We have a weird situation, and want to ask whether this is normal, and
if not, the things we should check for.

In our large system, compiled on Win 2000, we have these DLLs:
a.dll
b.dll
c1.dll
c2.dll

Matt Porter · Mar 9, 2004

Yes it is normal.

There are two ways to tackle it.

1. Every time you publish a new version of your DLL, change the file name.
For example MYDLL10.DLL, MYDLL11.DLL, MYDLL12.DLL, etc.

2. You can manage the exports in the .DEF file. In C++ it's really a pain
though because of the mangled names.

But yes what you are experiencing is normal.

I hope this helps.

Sincerely,
Matt Porter
American Systems
http://www.americansys.com
http://www.photocountry.com

Nick Bishop said:
We have a weird situation, and want to ask whether this is normal, and
if not, the things we should check for.

In our large system, compiled on Win 2000, we have these DLLs:
a.dll
b.dll
c1.dll
c2.dll
.
.
c50.dll
d.exe

a.dll does not depend on anything (except operating system calls).
b.dll depends on a.dll
c1 to c50.dll depend on both b.dll and a.dll
d.exe depends on a.dll and b.dll, and uses LoadLibrary to load c1 to c50.dll

The debug build appends a suffix to all these names, so we get
a_d.dll, b_d.dll ... d_d.exe (these suffixes will not be further
mentioned).

For a bug fix, I had to alter b.dll by adding a non-virtual,
non-inline method to a class called All_Data, then alter c2.dll to
call this new method.

I thought I should be able to get away with redeploying only b.dll and
c2.dll, leaving c1.dll and c3.dll to c50.dll as they were (not
recompiled). However, I got strange hangs and errors within all these
other DLLs, which went away after I recompiled them.

I tried a small version of the problem on a Linux system, with
libb.so,
libc1.so which calls a method from a class in libb.so,
and d (the executable which calls libc1.so).
I compiled/linked it all together, and ran d (the executable) with the
expected result. I then added the non-virtual, non-inline function to
libb.so, and relinked libb.so (only). I then ran d (the executable)
again with the expected result. This suggests that, at least on Linux,
adding a non-virtual, non-inline class method does not make
recompilation of dependent objects necessary, unless those objects,
too, were changed to call the new method.

Post script:
There is another application made up as follows:
e.dll, depends on b.dll and a.dll
f.exe
We also had a bug fix to f.exe, and (we assume) we had to recompile
e.dll because of the same behaviour exhibited by c1 to c50.dll.

We run debug binaries on one of our test servers, and release binaries
on another. When we distributed b.dll, c1 to c50.dll, d.exe, e.dll,
and f.exe, we copied the debug versions to the debug test server and
the release versions to the other test server. The debug version of
d.exe was OK, but the debug version of f.exe hung on startup. The
release version of f.exe worked OK. We eventually found that if we
replaced the debug version of a.dll, the hang problem of f.exe went
away.
Summary:
1. We simply can't understand why we had to replace the debug
version of a.dll on the debug test server. I double checked and there
was no change to the source code for a.dll.
2. If something was really wrong with a.dll, then it should have
affected both debug and release versions, and should have affected
d.exe as well.
3. I am suspicious of the need to recompile c1, c3 to c50.dll

Questions:
a. Is there something fundamental about Windows which makes it
necessary to recompile c1, c3 to c50.dll?
b. What are the things to check for?

c. Is it possible that compiling different parts of the application
under slightly different versions of Windows 2000 could introduce any
difficulties?
d. Or perhaps an inconsistency in compiler options between b.dll and
c1 to c50.dll? (although I find it hard to believe it would "stop
working" one day, if the compile options were wrong at the beginning)
e. Any ideas on why we had to replace the debug version of a.dll?

Nick Bishop
email replies ignored. Additional info below.

................................
The environment
1. We are running Windows 2000 everywhere, but some are 2000
professional, and others are 2000 server (some without Visual Studio).
2. The compiler is Visual Studio .NET (the original 2002 version)
(Help About gives Version 7.0.9466, and .NET Framework Version
1.0.3705)
3. The problem with needing to recompile c1, c3 to c50.dll showed
itself on a debug build compiled and run on 2000 Professional
4. The problem with needing to replace the debug version of a.dll
showed itself on a debug build compiled on one W2000 Server, and run
on another W2000 server, that only has the .NET framework installed,
where previous debug builds have run successfully for some time.

................................
The Source Code, for the reduced problem duplicated on Linux.
There are three directories:
b files to compile/link into libb.so
c1 files to compile/link into libc1.so
d file to compile/link into d (the executable)
******** and later on, another one *******
c2 files to compile/link into libc2.so

I compiled/linked the source exactly as presented below for b/, c1/, &
d/, and execute d (the executable) with the expected result.

I then uncomment the void someMethod(const char* const* theList)
method and recompile/relink only All_Data.* -> libb.so, and execute d
again with the expected result.

As a further test, I then add another directory c2 with a function
called int callTheFixedMethod() that calls the new method, and
uncomment the call in main(), compile/link the c2 directory to
libc2.so, recompile/relink d and execute it with the expected result
(NOTE libb.so is still the original)

// ----- b/All_Data.h ---------------------------------------
#ifndef ALL_DATA_H
#define ALL_DATA_H

class All_Data
{
public:
    All_Data();
    void someMethod();
    // void someMethod(const char* const* theList);
private:
    int someData;
};

#endif /* ALL_DATA_H */

// ----- b/All_Data.cpp -------------------------------------
#include "All_Data.h"
#include <iostream>

All_Data::All_Data() : someData(0)
{}

void All_Data::someMethod()
{
    std::cout << "someMethod(void)" << std::endl;
}

// Prints each entry of a null-terminated list of C strings.
// void All_Data::someMethod(const char* const* theList)
// {
//     int i = 0;
//     const char* iter;
//     for(iter = theList[0]; iter; iter = theList[i])
//     {
//         std::cout << iter << std::endl;
//         ++i;
//     }
// }

// ----- c1/The_Data.h --------------------------------------
#ifndef THE_DATA_H
#define THE_DATA_H

int callSomeMethodOnAll_Data();

#endif /* THE_DATA_H */

// ----- c1/The_Data.cpp ------------------------------------
#include "The_Data.h"
#include "../b/All_Data.h"

int callSomeMethodOnAll_Data()
{
    All_Data ad;
    ad.someMethod();
    return 0;
}

// ----- d/d.cpp ------------------------------------
#include "../c1/The_Data.h"

int main(void)
{
    // callTheFixedMethod();
    return callSomeMethodOnAll_Data();
}

// ----- c2/Fix_Data.h --------------------------------------
#ifndef FIX_DATA_H
#define FIX_DATA_H

int callTheFixedMethod();

#endif /* FIX_DATA_H */

// ----- c2/Fix_Data.cpp ------------------------------------
#include "Fix_Data.h"
#include "../b/All_Data.h"

static const char* const myList[] =
    { "One", "Two", 0 };

int callTheFixedMethod()
{
    All_Data ad;
    ad.someMethod(myList);
    return 0;
}

Sean Cavanaugh · Mar 9, 2004

Matt said:
Yes it is normal.

There are two ways to tackle it.

1. Every time you publish a new version of your DLL, change the file name.
For example MYDLL10.DLL, MYDLL11.DLL, MYDLL12.DLL, etc.

2. You can manage the exports in the .DEF file. In C++ it's really a pain
though because of the mangled names.

3. Call LoadLibrary and GetProcAddress manually.

With #2 and #3 it's very handy to extern "C" everything you can, to
simplify the naming as much as possible. It should be trivial to
take something like the output of dumpbin /exports foo.dll and
autogenerate some code to do #3 for you and fill out global function
pointers. Exported member functions are a big pain; the function
pointer overhead is already there in this case, so you might as well
make the function virtual and stop exporting it.
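
The last suggestion (make the function virtual and stop exporting it) could look roughly like the sketch below. It is only an illustration under assumptions: IAllData, someMethodWithList, callCount, CreateAllData and DestroyAllData are hypothetical names, not anything from the real b.dll, and on Windows the two extern "C" factory functions would still have to be exported from the DLL. The point is that only two plain C names need exporting; everything else is reached through the object's vtable. Adding further virtual functions later would still need care, since it changes the vtable layout that existing clients were compiled against.

```cpp
#include <cassert>

// Hypothetical abstract interface that clients of the DLL compile against.
// None of these member functions is exported by name or ordinal; callers
// reach them through the object's vtable.
class IAllData
{
public:
    virtual ~IAllData() {}
    virtual void someMethod() = 0;
    virtual void someMethodWithList(const char* const* theList) = 0;
    virtual int callCount() const = 0;   // illustrative only
};

namespace {

// Implementation that would live entirely inside the DLL.
class AllDataImpl : public IAllData
{
public:
    AllDataImpl() : calls_(0), items_(0) {}

    void someMethod() { ++calls_; }

    // Walks a null-terminated list of C strings, counting the entries.
    void someMethodWithList(const char* const* theList)
    {
        assert(theList != 0);
        for (int i = 0; theList[i] != 0; ++i)
            ++items_;
        ++calls_;
    }

    int callCount() const { return calls_; }

private:
    int calls_;
    int items_;
};

} // namespace

// The only two symbols that would need to be exported, with unmangled names.
// Destruction also goes through the DLL so that new/delete stay paired
// inside one module.
extern "C" IAllData* CreateAllData() { return new AllDataImpl; }
extern "C" void DestroyAllData(IAllData* p) { delete p; }
```

A client would call CreateAllData(), use the returned pointer, and hand it back to DestroyAllData() when finished. Everything is in one file here so that it compiles anywhere.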

Nick Bishop · Mar 15, 2004

Sean Cavanaugh said:
3. Call LoadLibrary and GetProcAddress manually.

With #2 and #3 it's very handy to extern "C" everything you can, to
simplify the naming as much as possible.

#3 is going to be very painful, but I'd like to explore #2 a bit
further. We already use a .DEF file.

I must admit, I don't understand the concept of .DEF files, so where
can I get some information about what role they play and how they
operate?

I do know the following:
a. We have a "home-made" gendefs program that is used to generate
the debug version of the .DEF file automatically, and the build
process builds this.
b. This is not used in the release version. We edit the .DEF file
by hand and put in the mangled name, and the next sequence number.
The .DEF file is stored in source control (Visual SourceSafe).
Extract follows:

EXPORTS
?end@?$_Iosb@H@std@@2W4_Seekdir@12@B @ 1 NONAME
...
?_ReScoreBUMS@@YA_NJPADH@Z @ 11513 NONAME
?UnloadAll_SPECIAL@All_Data@@QAEXPBQBD@Z @ 11514 NONAME

The mangled names are not a problem to us: the linker (when trying to
link c2.dll) will complain about "?funny_name_of_blah@some_blah" and
we just pick that up and cut-paste into the .DEF file along with the @
<seq-no> NONAME
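
The manual step described above (paste the mangled name, add the next sequence number) can be sketched as a small helper. This is a hypothetical illustration, not the poster's gendefs program. It assumes every export line is laid out like the extract, with whitespace before the "@" that introduces the ordinal. The names ordinalOf, nextOrdinal and exportLine are made up for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>

// Returns the ordinal on one EXPORTS line of the form "name @ N NONAME",
// or -1 if the line carries no ordinal.
// ASSUMPTION: the '@' that introduces the ordinal is preceded by a space or
// tab, which distinguishes it from the '@' characters inside a mangled name.
int ordinalOf(const std::string& line)
{
    std::string::size_type at = line.rfind('@');
    if (at == std::string::npos || at == 0)
        return -1;
    if (line[at - 1] != ' ' && line[at - 1] != '\t')
        return -1;

    const char* start = line.c_str() + at + 1;
    char* end = 0;
    long n = std::strtol(start, &end, 10);   // skips leading whitespace
    if (end == start || n <= 0)
        return -1;
    return static_cast<int>(n);
}

// Scans the whole .DEF text and returns one more than the highest ordinal.
int nextOrdinal(const std::string& defText)
{
    std::istringstream in(defText);
    std::string line;
    int highest = 0;
    while (std::getline(in, line))
    {
        int n = ordinalOf(line);
        if (n > highest)
            highest = n;
    }
    return highest + 1;
}

// Formats a new entry in the same style as the extract.
std::string exportLine(const std::string& mangledName, int ordinal)
{
    assert(ordinal > 0);
    std::ostringstream out;
    out << mangledName << " @ " << ordinal << " NONAME";
    return out.str();
}
```

Fed the three lines of the extract, nextOrdinal would report 11515 as the number to give the next export, leaving every existing ordinal where it was.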

Come to think of it, it was in the debug build that I saw the
necessity of replacing all the DLLs (without ever trying the release
build).

Now, is it possible that if I abandon the auto-generation on the debug
side of things, that the problems would eventually go away?
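
As a toy model of why a regenerated .DEF file might produce this symptom, consider the sketch below. It is a guess at a mechanism, not a diagnosis. It assumes that the auto-generator numbers exports in sorted-name order (nothing above says how gendefs assigns ordinals) and that clients bind by ordinal because the entries are NONAME. The export names are shortened stand-ins, not real mangled names, and regenerate, appendExport and resolve are hypothetical helpers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Ordinal -> export name, i.e. what a client bound by ordinal actually gets.
typedef std::map<int, std::string> ExportTable;

// ASSUMPTION: models an auto-generator that sorts the names and numbers
// them from 1 on every build.
ExportTable regenerate(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end());
    ExportTable table;
    for (std::size_t i = 0; i < names.size(); ++i)
        table[static_cast<int>(i) + 1] = names[i];
    return table;
}

// Models the hand-maintained file: existing ordinals are kept and the new
// name receives the next sequence number.
ExportTable appendExport(ExportTable table, const std::string& name)
{
    assert(!name.empty());
    int next = table.empty() ? 1 : table.rbegin()->first + 1;
    table[next] = name;
    return table;
}

// Returns the name found at an ordinal, or an empty string if none.
std::string resolve(const ExportTable& table, int ordinal)
{
    ExportTable::const_iterator it = table.find(ordinal);
    return it == table.end() ? std::string() : it->second;
}
```

With exports Load, Unload and Zap, a client built against the first table expects Unload at ordinal 2. Regenerating after adding Save puts Save at ordinal 2 in this model, so an unrebuilt client would call the wrong function. Appending Save instead gives it ordinal 4 and leaves Unload at 2.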

Sorry, what was I experiencing again ...
... some linker problems, or
... a murderous hatred of an American guy that wears glasses?

email replies ignored

Nick Bishop · Mar 15, 2004

In our large system, compiled on Win 2000, we have these DLLs:
a.dll
b.dll
c1.dll
c2.dll
.
.
c50.dll
d.exe

a.dll does not depend on anything (except operating system calls).
b.dll depends on a.dll [...]
Post script:
There is another application made up as follows:
e.dll, depends on b.dll and a.dll
f.exe
We also had a bug fix to f.exe, and (we assume) we had to recompile
e.dll because of the same behaviour exhibited by c1 to c50.dll.

Summary:
1. We simply can't understand why we had to replace the debug
version of a.dll on the debug test server. I double checked and there
was no change to the source code for a.dll.

Thanks to the two people who answered the main question.

Was anybody able to shed light on this other problem of why we had to replace a.dll?

Nick
email replies ignored.