VC++2003: Methods implemented outside class body are never inlined for templates that are instantiat

  • Thread starter: Felix I. Wyss

Felix I. Wyss

Good Afternoon,

I recently noticed that some very simple methods of a template declared and
used in a DLL library get inlined when used by the DLL itself, but not by
other DLLs and EXEs. After some investigating, I narrowed this down to a
very odd behavior (bug?) of the VC++.NET 2003 compiler: If a class that is
declared as __declspec(dllimport) derives from a template, that template's
methods are never inlined, even if declared with the "inline" specifier.
Methods that are defined within the class body are correctly inlined by the
compiler. The same is true of templates explicitly instantiated for a type
as "template class __declspec(dllimport) Template<Type>;".

As this is a bit tricky to explain, consider the following code:

exporttest_dll.h
================

// Header file of DLL

#ifdef BUILDING_EXPORTTEST_DLL
#define EXPORTTEST_DLL __declspec(dllexport)
#else
#define EXPORTTEST_DLL __declspec(dllimport)
#endif

namespace exporttest_dll
{

template<typename TYPE_T>
class Templ
{
public:
    Templ(void) : m_value() { }
    TYPE_T get(void) const
    {
        return m_value;
    }
    void set(TYPE_T value)
    {
        m_value = value;
    }

private:
    TYPE_T m_value;
};


template<typename TYPE_T>
class Templ2
{
public:
    Templ2(void);
    TYPE_T get(void) const;
    void set(TYPE_T value);

private:
    TYPE_T m_value;
};


template<typename TYPE_T>
inline Templ2<TYPE_T>::Templ2(void) : m_value()
{
}

template<typename TYPE_T>
inline TYPE_T Templ2<TYPE_T>::get(void) const
{
    return m_value;
}

template<typename TYPE_T>
inline void Templ2<TYPE_T>::set(TYPE_T value)
{
    m_value = value;
}


class EXPORTTEST_DLL Derived : public Templ<int>
{
};

class EXPORTTEST_DLL Derived2 : public Templ2<int>
{
};

template class EXPORTTEST_DLL Templ<long>;
template class EXPORTTEST_DLL Templ2<long>;

} // end of namespace exporttest_dll



main.cpp
==============

// Executable that uses the templates and classes declared by DLL:

#include <exporttest_dll.h>
#include <iostream>

int main(int, char*[])
{
    exporttest_dll::Derived test1;
    test1.set(1);
    std::cout << test1.get();

    exporttest_dll::Derived2 test2;
    test2.set(2);
    std::cout << test2.get();

    exporttest_dll::Templ<int> test3;
    test3.set(3);
    std::cout << test3.get();

    exporttest_dll::Templ2<int> test4;
    test4.set(4);
    std::cout << test4.get();

    exporttest_dll::Templ<long> test5;
    test5.set(5);
    std::cout << test5.get();

    exporttest_dll::Templ2<long> test6;
    test6.set(6);
    std::cout << test6.get();

    exporttest_dll::Templ<unsigned long> test7;
    test7.set(7);
    std::cout << test7.get();

    exporttest_dll::Templ2<unsigned long> test8;
    test8.set(8);
    std::cout << test8.get();
    return 0;
}


Building main.cpp with full optimization turned on (/Ob2 /Ogity /O2) results
in the code below. Note that for test1, test3, test5, test7, and test8
the compiler optimizes everything down to a constant.
However, for test2, test4, and test6, the compiler generates explicit calls
to the method specializations exported by the DLL.

Depends shows the exported methods for the specializations of Templ<int>,
Templ2<int>, Templ<long>, and Templ2<long>. Thus, the *only* difference
between Templ and Templ2 is that Templ defines its methods in the class
body, while Templ2 defines them outside the class body with the "inline"
specifier. Just in case you were wondering: Yes, I tried using
__forceinline -- it doesn't make a difference.

Also note that test7 and test8 use a type as template argument for which the
template was not instantiated with __declspec(dllimport) and the compiler
correctly inlines both Templ and Templ2's methods as expected.


int main(int, char*[])
{
exporttest_dll::Derived test1;
test1.set(1);
std::cout << test1.get();
00401000 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
00401006 83 EC 0C sub esp,0Ch
00401009 53 push ebx
0040100A 56 push esi
0040100B 57 push edi
0040100C 6A 01 push 1
0040100E FF 15 0C 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(40200Ch)]

exporttest_dll::Derived2 test2;
00401014 8B 35 68 20 40 00 mov esi,dword ptr
[__imp_exporttest_dll::Templ2<int>::Templ2<int> (402068h)]
0040101A 8D 4C 24 0C lea ecx,[esp+0Ch]
0040101E FF D6 call esi
test2.set(2);
00401020 8B 3D 6C 20 40 00 mov edi,dword ptr
[__imp_exporttest_dll::Templ2<int>::set (40206Ch)]
00401026 6A 02 push 2
00401028 8D 4C 24 10 lea ecx,[esp+10h]
0040102C FF D7 call edi
std::cout << test2.get();
0040102E 8B 1D 70 20 40 00 mov ebx,dword ptr
[__imp_exporttest_dll::Templ2<int>::get (402070h)]
00401034 8D 4C 24 0C lea ecx,[esp+0Ch]
00401038 FF D3 call ebx
0040103A 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
00401040 50 push eax
00401041 FF 15 0C 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(40200Ch)]

exporttest_dll::Templ<int> test3;
test3.set(3);
std::cout << test3.get();
00401047 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
0040104D 6A 03 push 3
0040104F FF 15 0C 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(40200Ch)]

exporttest_dll::Templ2<int> test4;
00401055 8D 4C 24 10 lea ecx,[esp+10h]
00401059 FF D6 call esi
test4.set(4);
0040105B 6A 04 push 4
0040105D 8D 4C 24 14 lea ecx,[esp+14h]
00401061 FF D7 call edi
std::cout << test4.get();
00401063 8D 4C 24 10 lea ecx,[esp+10h]
00401067 FF D3 call ebx
00401069 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
0040106F 50 push eax
00401070 FF 15 0C 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(40200Ch)]

exporttest_dll::Templ<long> test5;
test5.set(5);
std::cout << test5.get();
00401076 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
0040107C 6A 05 push 5
0040107E FF 15 10 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(402010h)]

exporttest_dll::Templ2<long> test6;
00401084 8D 4C 24 14 lea ecx,[esp+14h]
00401088 FF 15 74 20 40 00 call dword ptr
[__imp_exporttest_dll::Templ2<long>::Templ2<long> (402074h)]
test6.set(6);
0040108E 6A 06 push 6
00401090 8D 4C 24 18 lea ecx,[esp+18h]
00401094 FF 15 78 20 40 00 call dword ptr
[__imp_exporttest_dll::Templ2<long>::set (402078h)]
std::cout << test6.get();
0040109A 8D 4C 24 14 lea ecx,[esp+14h]
0040109E FF 15 7C 20 40 00 call dword ptr
[__imp_exporttest_dll::Templ2<long>::get (40207Ch)]
004010A4 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
004010AA 50 push eax
004010AB FF 15 10 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(402010h)]

exporttest_dll::Templ<unsigned long> test7;
test7.set(7);
std::cout << test7.get();
004010B1 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
004010B7 6A 07 push 7
004010B9 FF 15 14 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(402014h)]

exporttest_dll::Templ2<unsigned long> test8;
test8.set(8);
std::cout << test8.get();
004010BF 8B 0D 08 20 40 00 mov ecx,dword ptr [__imp_std::cout
(402008h)]
004010C5 6A 08 push 8
004010C7 FF 15 14 20 40 00 call dword ptr
[__imp_std::basic_ostream<char,std::char_traits<char> >::operator<<
(402014h)]
004010CD 5F pop edi
004010CE 5E pop esi


Thanks,
Felix I. Wyss
Interactive Intelligence, Inc.
 
Felix I. Wyss said:
Good Afternoon,

Greetings.

I recently noticed that some very simple methods of a template declared and
used in a DLL library get inlined when used by the DLL itself, but not by
other DLLs and EXEs. After some investigating, I narrowed this down to a
very odd behavior (bug?) of the VC++.NET 2003 compiler: If a class that is
declared as __declspec(dllimport) derives from a template, that template's
methods are never inlined, even if declared with the "inline" specifier.
Methods that are defined within the class body are correctly inlined by the
compiler. The same is true of templates explicitly instantiated for a type
as "template class __declspec(dllimport) Template<Type>;".

As this is a bit tricky to explain, consider the following code:

[Cut 260 lines of C++, text, and assembler output.]

Whew! I think it might be simplest to just take your
word for what is happening since you appear to have
done your homework, (except for reducing the code
to a minimal example!)

First, the 'inline' keyword is strictly an advisory to the
compiler. There is no requirement in the C++ standard
that the compiler inline whatever is so marked, nor is
the compiler precluded from inlining code not so marked.

So 'bug?' cannot be the issue.
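
To make the "advisory" point concrete, here is a minimal sketch in standard C++ (nothing VC++-specific). The names twice, call_through_pointer, and run_checks are made up for illustration. It shows that a function marked inline is still an ordinary function with an address; whether any particular call gets expanded in place is left to the compiler.

```cpp
#include <cassert>

// `inline` lets this definition live in a header that several translation
// units include. It does not force any call to be expanded in place.
inline int twice(int x)
{
    return 2 * x;
}

// Hypothetical helper: calling through a pointer requires an out-of-line
// copy of the function, which the compiler must still be able to emit.
int call_through_pointer(int (*fn)(int), int value)
{
    return fn(value);
}

// Hypothetical self-check: the direct call (which an optimizer may or may
// not expand) and the indirect call must give the same result.
void run_checks()
{
    assert(twice(21) == 42);
    assert(call_through_pointer(&twice, 21) == twice(21));
}
```

The observable result is the same either way, which is why the standard can leave the expansion decision to the implementation.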

One might make an argument (albeit a complex one)
that this is a quality of implementation issue. I will
take the side of the compiler implementers on this one.

Why would you use a DLL if it was not supposed to
be interchangeable with newer versions or (in the case
of plugin architectures) a different implementation?
Inlined code from class methods is first and foremost
an implementation detail that is exposed only for the
sake of execution efficiency. DLLs are primarily
motivated by a desire to separate interface from
implementation, and bind the two at nearly the latest
possible moment. (See "delay loading".) So, if the
compiler feature designer(s) elected to not add some
syntax for saying "Bind my implementation into the
DLL client code." while still allowing inlining for the
benefit of the implementation of the DLL itself, I can
see that as an arguably sensible decision.

If you cannot get happy with it, I would suggest that
a few friends with whatever privileged access those
would-be inlined methods enjoyed could be made
to accomplish the same effect. That approach will
degrade encapsulation somewhat, but that clearly
is something you are willing to tolerate.
 
First, the 'inline' keyword is strictly an advisory to the
compiler. There is no requirement in the C++ standard
that the compiler inline whatever is so marked, nor is
the compiler precluded from inlining code not so marked.

Yes, I know. However, you're missing the point! Semantically, the following
two are equivalent (see C++ standard: 7.1.2-3, 9.3-2, and 9.3-3):

template<typename T>
class Foo
{
public:
    T foo(void) const
    {
        return m_value;
    }
    T m_value;
};

template<typename T>
class Foo
{
public:
    T foo(void) const;
    T m_value;
};

template<typename T>
inline T Foo<T>::foo(void) const
{
    return m_value;
}

The fact that the VC compiler treats them differently in some cases must be
considered a bug.

One might make an argument (albeit a complex one)
that this is a quality of implementation issue. I will
take the side of the compiler implementers on this one.

Why? If you are taking the side of the implementer, you have to believe
that there is a rationale for treating these two cases differently. This not
only violates the C++ standard but also the principle of least surprise.

Why would you use a DLL if it was not supposed to
be interchangeable with newer versions or (in the case
of plugin architectures) a different implementation?
Inlined code from class methods is first and foremost
an implementation detail that is exposed only for the
sake of execution efficiency. DLLs are primarily
motivated by a desire to separate interface from
implementation, and bind the two at nearly the latest
possible moment.

You are jumping to conclusions from the fact that I had to make a very simple
example (which you then even find not basic enough) to showcase the problem.
Yes, in this simplistic example, using a DLL is of course overkill.
The problem is that from what I've seen, even if the template is declared in
a separate header file or a static lib, as soon as some class declared with
__declspec(dllimport) derives from that template, the template itself is
instantiated with __declspec(dllimport) and all its member functions are
exported by the DLL. If the writer of that template didn't put the
implementations of the member functions in the class body, they will never be
inlined due to this bug.

If you cannot get happy with it, I would suggest that
a few friends with whatever privileged access those
would-be inlined methods enjoyed could be made
to accomplish the same effect. That approach will
degrade encapsulation somewhat, but that clearly
is something you are willing to tolerate.

That doesn't make any sense at all. It is akin to saying: "that bug is not
really a bug because there is a workaround for it". As it were, there is
actually a much simpler workaround: put the implementation of the member
functions into the class body. However, that of course only works for code
one has control over (as opposed to third party libraries). Even then,
having to go back and identify cases where it makes sense to move the
implementation into the body in existing code is rather tedious and
error-prone.
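
A minimal sketch of that workaround, assuming the Templ2 and Derived2 declarations from the original header. EXPORTTEST_DLL is defined as empty here only so the fragment builds standalone, and round_trip is a hypothetical helper. Whether this actually changes inlining on VC++.NET 2003 rests on the behavior reported earlier in the thread; it has not been re-verified here.

```cpp
#include <cassert>

// Placeholder so the sketch builds on its own. The real header defines this
// as __declspec(dllexport) or __declspec(dllimport).
#define EXPORTTEST_DLL

namespace exporttest_dll
{

// Templ2 from the original post, with the member bodies moved into the
// class definition instead of being defined outside with "inline".
template<typename TYPE_T>
class Templ2
{
public:
    Templ2() : m_value() { }
    TYPE_T get() const { return m_value; }
    void set(TYPE_T value) { m_value = value; }

private:
    TYPE_T m_value;
};

class EXPORTTEST_DLL Derived2 : public Templ2<int>
{
};

} // end of namespace exporttest_dll

// Hypothetical helper used only to exercise the class.
int round_trip(int value)
{
    exporttest_dll::Derived2 d;
    d.set(value);
    return d.get();
}
```

The interface and behavior are unchanged; only the location of the member definitions moves, which is why this only helps for headers one is free to edit.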

--Felix
 
Felix I. Wyss said:
Yes, I know. However, you're missing the point!

That point is central to the "bug?" issue. I will try,
however, to comprehend your other points, as I
did initially (subject to code reading hesitation).

Semantically, the following
two are equivalent (see C++ standard: 7.1.2-3, 9.3-2, and 9.3-3):

template<typename T>
class Foo
{
public:
T foo(void) const
{
return m_value;
}
T m_value;
};

template<typename T>
class Foo
{
public:
T foo(void) const;
T m_value;
};

template<typename T>
inline T Foo<T>::foo(void) const
{
return m_value;
}

Semantically, the inline keyword is invisible unless
it appears in (quite a few) disallowed places.

The fact that the VC compiler treats them differently in some cases must be
considered a bug.

I disagree. Within the latitude granted by the C++ standard
as to when inlining must or must not occur, choices that fall
inside the allowed bounds cannot properly be called bugs.

Why? If you are taking the side of the implementer, you have to believe
that there is a rationale for treating these two cases differently. This not
only violates the C++ standard but also the principle of least surprise.

I deny that it violates the C++ standard. As for the
principle of least surprise, I will grant that you have
a point, but even that is not a given because surprise
occurs in the mind according to its expectations, a
highly variable set. (I've seen no surprise quantifiers!)

You are jumping to conclusions from the fact that I had to make a very simple
example (which you then even find not basic enough) to showcase the problem.

If you claim that problem cannot be demonstrated
with significantly less code, I will not gainsay you.

Yes, in this simplistic example, using a DLL is of course overkill.

But is not the use of constructs related to DLL creation
at the core of this problem? If not, I misunderstood you.

The problem is that from what I've seen, even if the template is declared in
a separate header file or a static lib, as soon as some class declared with
__declspec(dllimport) derives from that template, the template itself is
instantiated with __declspec(dllimport) and all its member functions are
exported by the DLL. If the writer of that template didn't put the
implementations of the member functions in the class body, they will never be
inlined due to this bug.

I can only draw one new fact from that. The rest
appears to repeat your original statement. From
the above, I think you may not know quite where
template code resides before becoming translated
to a concrete, executable form. (But that is a side
issue, I think.)

That doesn't make any sense at all. It is akin to saying: "that bug is not
really a bug because there is a workaround for it".

Well, that might be one interpretation. But what I was
trying to do is provide you a way to deal with the tools
you have, as they are, and get what you want. If you
want to treat it as part of my "not a bug" contention,
please just drop that element because it provides very
poor, pathetic support for that position.

As it were, there is
actually a much simpler workaround: put the implementation of the member
functions into the class body.

I realize that now. Maybe that is the syntax I was
mentioning. And that would be surprising. Maybe
it should be considered a doc bug.

However, that of course only works for code
one has control over (as opposed to third party libraries). Even then,
having to go back and identify cases where it makes sense to move the
implementation into the body in existing code is rather tedious and
error-prone.

Well, inlined members for stuff in a DLL are
error prone. Maybe that tedium acts as a
beneficial barrier.
 
Felix said:
Yes, I know. However, you're missing the point! Semantically, the following
two are equivalent (see C++ standard: 7.1.2-3, 9.3-2, and 9.3-3):

Actually, I don't believe that they are. If the function wasn't originally
introduced with the inline keyword the compiler is free to ignore any
subsequent request to inline it.

template<typename T>
class Foo
{
public:
T foo(void) const
{
return m_value;
}
T m_value;
};

template<typename T>
class Foo
{
public:
T foo(void) const;

Change this to

inline T foo() const;

and your assertion that they're the same is correct, but...

T m_value;
};

template<typename T>
inline T Foo<T>::foo(void) const
{
return m_value;
}

The fact that the VC compiler treats them differently in some cases
must be considered a bug.

I don't think so. Whether a function is inlined or not is not observable
behavior according to the C++ standard, so the compiler's free to inline or
not at its sole discretion - including never inlining or always inlining or
inlining only functions that contain an odd number of source tokens (or
functions

I'll admit it's quirky, and perhaps unexpected, but I don't think it can be
classified as a compiler bug.

-cd
 
Actually, I don't believe that they are. If the function wasn't
originally introduced with the inline keyword the compiler is free to
ignore any subsequent request to inline it.

I'm sorry, but you're wrong. From the standard:

9.3-2
"A member function may be defined (8.4) in its class definition,
in which case it is an inline member function (7.1.2), [...]"

I think you're confusing it with this one:

9.3-3
"An inline member function [...] may also be defined outside of its
class definition provided either its declaration in the class
definition or its definition outside of the class definition declares
the function as inline."

Change this to

inline T foo() const;

Doesn't matter, see above. At any rate, this has no bearing on the issue
at hand anyway. After all, VC *does* inline the member functions if they
are implemented in the template class body (without the "inline" qualifier).
It doesn't inline them if they are defined outside the template class body
with the "inline" qualifier.
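
For reference, the spellings under discussion can be put side by side. This is a sketch in standard C++; the class names and read_back are hypothetical. All three members are inline functions as far as the language is concerned, and the sketch only checks that they behave the same, not how any compiler chooses to expand the calls.

```cpp
#include <cassert>

// Form 1: defined in the class body, hence implicitly inline (9.3-2).
template<typename T>
class InBody
{
public:
    T foo() const { return m_value; }
    T m_value;
};

// Form 2: "inline" on the declaration inside the class, body outside.
template<typename T>
class InlineOnDeclaration
{
public:
    inline T foo() const;
    T m_value;
};

template<typename T>
T InlineOnDeclaration<T>::foo() const
{
    return m_value;
}

// Form 3: plain declaration, "inline" on the out-of-class definition (9.3-3).
template<typename T>
class InlineOnDefinition
{
public:
    T foo() const;
    T m_value;
};

template<typename T>
inline T InlineOnDefinition<T>::foo() const
{
    return m_value;
}

// Hypothetical helper: stores a value in any of the holders above and
// reads it back through foo().
template<typename HOLDER_T>
int read_back(int value)
{
    HOLDER_T holder;
    holder.m_value = value;
    return holder.foo();
}
```

Nothing in the language lets a program observe which of these calls was expanded in place, so the only way to see the difference reported in this thread is to inspect the generated code.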

I don't think so. Whether a function is inlined or not is not observable
behavior according to the C++ standard, so the compiler's free to inline
or not at its sole discretion - including never inlining or always inlining
or inlining only functions that contain an odd number of source tokens (or
functions

I'll admit it's quirky, and perhaps unexpected, but I don't think it can
be classified as a compiler bug.

Well, that may be technically true, but I consider arbitrarily generating poor
code for, according to the standard, semantically equivalent language
constructs a bug -- or at least a very dubious "feature" that should be
fixed.

--Felix
 