Ordering of #includes

  • Thread starter: Ben Taylor

Ben Taylor

Hi,
Coming from VB, I still haven't really grasped how, in
C++, if function A wants to call function B, then
function B has to come before function A in the
compilation, and how #includes whose classes rely on
classes in other headers have to come after those headers.

I've got three questions about this if I may,

1) When I create an MFC application by having a dialog
class and an application class (generated by the wizard)
as the first four files (CMyDlg.h, CMyDlg.cpp, CMyApp.h
and CMyApp.cpp) and I want to add another class,
CMyClass, which of the four wizard generated files I
mentioned should I #include it in normally? And do I also
have to include the CMyClass.cpp file - or if not, does
it know to pick this file up automatically for the
compilation of the implementation of CMyClass's functions?

2) I experienced strange behavior in VC++.net 2002 when I
wanted to have some #define statements that largely
related to an extra class I added, but would be needed a
bit by my CMainDlg class in its .cpp file as well, and the
compiler would only accept it if I put them in the added
class's header file - how did it not work if they were
before all the #includes? What is the general rule for
how you should generally order and place the #include and
define statements if you are building an application that
has two main classes for the dialog and application
objects and lots of others, and you want to keep them in
separate files?

3) Where should global functions that some of the classes
might want to call be, and where should they be if those
global functions want to refer to objects that are
instantiated from classes that are defined in different
files?


I'd be very thankful if somebody could give me an
explanation here as it's beginning to confuse me a bit
and making my project layout a bit messy as I just have
to keep moving them around till the compiler hasn't
got 'so-and-so not declared' beef.

Many thanks
Best wishes
Ben
 
Ben said:
Hi,
Coming from VB, I still haven't really grasped how, in
C++, if function A wants to call function B, then
function B has to come before function A in the
compilation, and how #includes whose classes rely on
classes in other headers have to come after those headers.

Important concepts to keep in mind:

Compilation model.

VB (and C# and Java) use a "whole program" compilation model. That is, the
compiler (at least logically if not in fact) looks at the entire program
(assembly, DLL, etc) at once.

C and C++ use a "single module" compilation model. In this model, the
compiler sees a single module at a time, and has no concept or knowledge of
what's in any other module. By convention, a module is a .C or .CPP file,
but really it's whatever file is supplied to the compiler on the command
line - there's nothing special about a ".cpp file" as opposed to a "header
file".



Declarations versus definitions.

Declaration: declares to the compiler that a thing (class, function,
object, etc) exists. Provides all the information necessary to use the
thing (i.e. call a function, reference an object, etc). Analogous to the
"interface" of the thing.

example: a function prototype is a declaration.

Definition: Supplies the details about how the compiler is to build the
thing. Every definition is also a declaration, so a definition with no prior
declaration serves as both. Analogous to the "implementation" of the thing.

example: a function header with a body is a definition.

In languages like VB, C# and Java, there is no separation of declaration
from definition. Since these languages use a "whole program" compilation
model, the compiler can simply search all of the modules of the program for
the declaration of anything that it needs.

In C and C++, since the compiler only sees one module at a time, it cannot
find the definitions of things that are referenced in one module but defined
in another. You have to help it by providing declarations for everything
that's referenced, even if it's not defined in that particular module. By
convention, the way you supply declarations is through "header files", but
that's a convention only.
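
As a minimal sketch of the point above (the names A and B come from the original question; the return values are made up for illustration), a declaration is all the compiler needs in order to compile a call, and the definition can come later or live in another module entirely:

```cpp
#include <cassert>

// Declaration (prototype): tells the compiler that B exists and how to call it.
// In a multi-file project this line would normally come from a header file.
int B();

// A can call B here because B has been declared above, even though the
// compiler has not yet seen B's body.
int A() { return B() + 1; }

// Definition: supplies the body. It could just as well live in another .cpp file.
int B() { return 41; }

// Hypothetical self-check; call it from main() to exercise the code above.
void check_call_order() { assert(A() == 42); }
```

Removing the `int B();` line makes the call inside A fail to compile, which is the 'so-and-so not declared' error from the original question.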


The One Definition Rule

In C and C++ there's an important rule: The One Definition Rule. Also
referred to as "The ODR". This states that a thing (function, object, etc)
must have a single definition within a complete program or the results are
undefined. Note that's a single definition, not a single declaration!
There are loopholes for things like inline functions and class definitions,
which can, in effect, be defined in several places (once per module). The
behavior of the program is still undefined if those definitions are in fact
different.


Preprocessing

VB, C#, Java and similar languages have no concept of a preprocessor in the
way that C/C++ does. The preprocessor combines input from one or more files
into a "token stream", which is what the compiler proper consumes. That is,
the compiler itself really doesn't know anything about files or file names
or naming conventions - it only sees the single linear token stream that the
preprocessor supplies.



Single-pass

C and C++ compilers logically pass through the source code one time, in
order, from top to bottom. In theory, for example, a C++ compiler could
receive its token stream through a TCP socket from the preprocessor, which
could, in theory, be running on a separate machine.


Now, with those foundation concepts in mind:

I've got three questions about this if I may,

1) When I create an MFC application by having a dialog
class and an application class (generated by the wizard)
as the first four files (CMyDlg.h, CMyDlg.cpp, CMyApp.h
and CMyApp.cpp) and I want to add another class,
CMyClass, which of the four wizard generated files I
mentioned should I #include it in normally? And do I also
have to include the CMyClass.cpp file - or if not, does
it know to pick this file up automatically for the
compilation of the implementation of CMyClass's functions?

The normal MFC programming convention would be to make two new files:
CMyClass.h and CMyClass.cpp. You'd put the declaration of CMyClass in the
header file and the definitions of its member functions in the .cpp file,
then #include CMyClass.h in the appropriate place(s). You don't #include
CMyClass.cpp anywhere: once it's part of the project it is compiled as a
module of its own, and the linker combines the results.

If, for example, CMyDlg uses CMyClass in its definition (suppose CMyDlg has
a data member of type CMyClass), then you'd #include CMyClass.h in CMyDlg.h.
If the declaration of CMyClass is only needed by (for example) the
definition of CMyApp, then you'd normally #include CMyClass.h in CMyApp.cpp
and not in CMyApp.h.

Note that a common MFC convention is to #include everything in stdafx.h,
#include nothing in other header files, and #include stdafx.h in all .cpp
files. This somewhat simulates the "whole program" compilation model, since
every module contains a declaration of everything in the program, even if
it's not needed.

2) I experienced strange behavior in VC++.net 2002 when I
wanted to have some #define statements that largely
related to an extra class I added, but would be needed a
bit by my CMainDlg class in its .cpp file as well, and the
compiler would only accept it if I put them in the added
class's header file - how did it not work if they were
before all the #includes? What is the general rule for
how you should generally order and place the #include and
define statements if you are building an application that
has two main classes for the dialog and application
objects and lots of others, and you want to keep them in
separate files?

Generally, you should place things (#defines, definitions, declarations,
etc) in a file that's included in the correct place(s). If the "thing" is a
definition, it needs to be placed in a file that will not violate the ODR.
For example, non-inline function definitions (member functions included)
should be placed in a "cpp file", while inline function definitions should
be placed in a "header file", because an inline definition must be visible
in every module that references it.

3) Where should global functions that some of the classes
might want to call be, and where should they be if those
global functions want to refer to objects that are
instantiated from classes that are defined in different
files?

Global functions should be avoided generally - consider making them static
class members if that's appropriate. If you need something to be global,
put it in the same file with a class implementation to which it's closely
related, or in a file by itself. Put the declaration of the function in a
header file that is included in the appropriate place(s). That might be in
the same header file with the declaration of a closely related class, or it
might be in a header file by itself, or it might be in a file with other
globals, etc - the possibilities are endless.
I'd be very thankful if somebody could give me an
explanation here as it's beginning to confuse me a bit
and making my project layout a bit messy as I just have
to keep moving them around till the compiler hasn't
got 'so-and-so not declared' beef.

I hope that helps a bit. You might want to get (or look at) the books
"Writing Solid Code" and/or "Large Scale C++ Software Design", both of which
have in-depth discussions of and recommendations about how to divide C/C++
source code into files.

-cd
 
Ben said:
1) When I create an MFC application by having a dialog
...
CMyClass, which of the four wizard generated files I
mentioned should I #include it in normally? And do I also
have to include the CMyClass.cpp file - or if not, does
it know to pick this file up automatically for the
compilation of the implementation of CMyClass's functions?

Put #include "MyClass.h" in every source (.cpp) file that uses CMyClass.

IMPORTANT: If you have some member variable of type CMyClass (as
opposed to pointer to CMyClass, or CMyClass*) then you have to put
#include in the respective header file, instead of source file.
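
A rough single-file sketch of why that distinction matters. CMyClass is the class from the question; CPointerDlg, CValueDlg and the member names are hypothetical stand-ins, and the comments mark where each #include would go in a real project:

```cpp
#include <cassert>

// Forward declaration: enough for pointers and references to CMyClass.
class CMyClass;

// Hypothetical dialog class that only holds a pointer. Its header would not
// need #include "MyClass.h"; the forward declaration above is sufficient.
class CPointerDlg {
public:
    explicit CPointerDlg(CMyClass* helper) : m_pHelper(helper) {}
    CMyClass* Helper() const { return m_pHelper; }
private:
    CMyClass* m_pHelper;   // the size of a pointer is known without the class body
};

// Full class body: this is what #include "MyClass.h" would provide.
class CMyClass {
public:
    int Value() const { return 7; }
};

// Hypothetical dialog class with a by-value member. This compiles only
// because the full body of CMyClass appears above it.
class CValueDlg {
public:
    int HelperValue() const { return m_helper.Value(); }
private:
    CMyClass m_helper;     // the compiler must know sizeof(CMyClass) here
};

// Hypothetical self-check; call it from main().
void check_members() {
    CMyClass c;
    CPointerDlg p(&c);
    CValueDlg v;
    assert(p.Helper() == &c);
    assert(v.HelperValue() == 7);
}
```

Moving CValueDlg above the full body of CMyClass produces an "incomplete type" error, while CPointerDlg is fine in either position.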
2) I experienced strange behavior in VC++.net 2002 when I
wanted to have some #define statements that largely
related to an extra class I added, but would be needed a
bit by my CMainDlg class in its .cpp file as well, and the
compiler would only accept it if I put them in the added
class's header file - how did it not work if they were
before all the #includes? What is the general rule for
how you should generally order and place the #include and
define statements if you are building an application that
has two main classes for the dialog and application
objects and lots of others, and you want to keep them in
separate files?

If you define some macro (or, as you call it, a define statement) that
is used in another file too, then the definition must precede its
use. Be aware that the macro should then be defined in a header file, and
not in a source file.
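
A small sketch of the ordering rule, with made-up names (MAX_ITEMS, ClampItemCount, and the "Limits.h" mentioned in the comment are all hypothetical). One possible explanation for the behaviour described in the question, offered only as a guess since the original code isn't shown: when precompiled headers are in use, Visual C++ skips everything that appears before the #include "stdafx.h" line, so a #define placed above it has no effect.

```cpp
#include <cassert>

// In a real project these three lines would live in a shared header,
// for example a hypothetical "Limits.h", included by every .cpp that needs them.
#ifndef MAX_ITEMS
#define MAX_ITEMS 16
#endif

// The macro is usable from this point to the end of the module.
// If the #define were moved below this function, MAX_ITEMS would be an
// undeclared identifier here.
int ClampItemCount(int requested) {
    return requested > MAX_ITEMS ? MAX_ITEMS : requested;
}

// Hypothetical self-check; call it from main().
void check_macro_order() {
    assert(ClampItemCount(100) == 16);
    assert(ClampItemCount(3) == 3);
}
```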

I first include "Stdafx.h" (if I use it, i.e. if I use precompiled
headers), then additional system headers that are not already in the
precompiled header, then the main application header, in your case
"MyApp.h", then the corresponding header for the source file currently
being compiled (the one in which I put all these includes), then all my
other headers that the current source needs. This seems to be The Safe Way
(at least to me).
3) Where should global functions that some of the classes
might want to call be, and where should they be if those
global functions want to refer to objects that are
instantiated from classes that are defined in different
files?

You have all the freedom you want (and much more) to place functions,
whether they are global or belong to a class. You could put all the
globals in a file called Globals.cpp just so you would know where to
find them. If some global function is very closely related to some
class then you could put it in the source file of that class, and even
make it a static member function of that class.
 
Ben said:
yes, that helps a lot thanks Carl.
So it doesn't really matter then if .h files get included
more than once and thus the same data getting sent to the
compiler from the preprocessor more than once, because
it's only a header file and shouldn't take up much space
anyway, or will it ignore definitions that it's already
seen before?

It may or may not matter if a header file gets included more than once,
depending on what's in it.

Usual practice (again, see "Writing Solid Code") is to use an "include
guard" around the contents of each header file

// myfile.h
#ifndef included_myfile_h
#define included_myfile_h

// balance of myfile.h

#endif

The effect of the guard is to make the header file "idempotent" - which
means that including it twice has the same effect as including it once.
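
To see the guard at work without creating separate files, here is a sketch that pastes the contents of a hypothetical header "point.h" into one file twice, which is what the preprocessor effectively does when the header is included twice (Point and the guard macro name are invented for the example):

```cpp
#include <cassert>

// ---- first inclusion of the hypothetical point.h ----
#ifndef INCLUDED_POINT_H
#define INCLUDED_POINT_H
struct Point { int x; int y; };   // a class definition: only one per module
#endif

// ---- second inclusion of the same header ----
// INCLUDED_POINT_H is already defined, so the preprocessor skips everything
// up to #endif and the compiler never sees a second definition of Point.
#ifndef INCLUDED_POINT_H
#define INCLUDED_POINT_H
struct Point { int x; int y; };
#endif

int sum(const Point& p) { return p.x + p.y; }

// Hypothetical self-check; call it from main().
void check_guard() {
    Point p = {2, 3};
    assert(sum(p) == 5);
}
```

Deleting the guard lines from both copies turns this into a "redefinition of struct Point" error.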

But as for ignoring anything - absolutely not. The compiler will examine
everything that's in the preprocessed token stream and act on it. If
something's duplicated, it may or may not be an error, depending on what it
is: remember that declarations can be repeated, while definitions cannot.

Some examples:

void f(); // declaration - can be repeated.

void f() { } // definition - cannot be repeated (ODR).

class X; // declaration - can be repeated

class X
{
void xf();
}; // definition of class X - can't be repeated in the same module

void X::xf() {} // definition - can't be repeated

inline void g() {} // definition - can be repeated (once per module) since it's inline

Also I'm not that clear on Static member functions. If I
want a class to have a static member function, how exactly
do I define it and what differences
(advantages/disadvantages) does it have over non-static
member functions? Is it that it can be called without
having a reference to an object of the class, and if so,
how will it know if it can use 'this' or not?


A static member function is, in effect (and in actuality with every compiler
that I know of) nothing more than an ordinary "global" function with a funny
name. So, calling a static member function does not require an instance of
the class, and there is no 'this' pointer visible within the body of a
static member function. You use static member functions to control the
visibility of functions that don't need a 'this' pointer.
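
Since the question also asked how to define one, here is a minimal sketch. The Counter class and its members are invented for illustration:

```cpp
#include <cassert>

class Counter {
public:
    Counter() : m_id(0) { ++s_count; }
    ~Counter() { --s_count; }

    // Static member function: declared with 'static' inside the class.
    // It has no 'this', so it can only touch static members directly.
    static int LiveCount();

    // Non-static member function: called on an object, so 'this' is available.
    int Id() const { return m_id; }

private:
    static int s_count;   // declaration of the shared counter
    int m_id;
};

// Definitions outside the class: note that 'static' is not repeated here.
int Counter::s_count = 0;
int Counter::LiveCount() { return s_count; }

// Hypothetical self-check; call it from main().
void check_static_member() {
    assert(Counter::LiveCount() == 0);   // callable with no object at all
    Counter a, b;
    assert(Counter::LiveCount() == 2);
    assert(a.Id() == 0);                 // non-static call needs an object
}
```

Writing `return m_id;` inside LiveCount would be a compile error, because there is no object for m_id to belong to.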

-cd
 
I'm not going to buy any more books for the moment as I've
already got three, one that's on MFC and is about three
inches thick and yet another one on the way from Amazon,
but that's a jolly good idea, thanks - I've now got two
options. Would you say this was better than having all the
includes in stdafx.h, and having all cpp files include
stdafx.h?
Also, a static member function can't be overridden, can it?
I didn't think it could because in the class wizard if you
tick virtual it unchecks static, but I just wondered.


Thanks
Ben

 
You can include a header file more than once ONLY if the header file is prepared to be
included more than once! Otherwise, you can have problems with duplicate definitions. The
concept of a header file "taking up space" is meaningless with respect to generated code,
since this is a concept that only applies to the compilation process. The compiler does not
simply ignore declarations it has seen before; there are various rules about repetition
in the C and C++ languages, and if you violate them you will get an error.

Note that a header file should include ONLY the function names that are exported in the
module. That is, if I do, in pure C:

int A() { return B(); }
static int B() { return 5; }

then a declaration of B would be totally and completely inappropriate in a header file! B
is not an exported interface.

However, it would always be valid to do
extern int A();
in a header file if functions from outside the module call A. If they do not, why was A
not declared static (in the C sense)?

However, it is always valid to do a forward declaration:

static int B();

inside the .c file, so you could write

static int B();
int A() { return B(); }
static int B() { return 5; }

The issue is definition vs. declaration. In the above three lines, the first line says
there will be a defined function B which takes 0 parameters and returns an int. The next
line calls it, and the call is valid because the name, its parameters, and return type are
all known. Finally, we define the function. This "forward declaration" technique is used
in nearly all languages.

Note that VB doesn't need it because VB doesn't care. Since most of the work is done by
interpretation at runtime, the decisions can be deferred. All other languages in history,
with few exceptions, require a forward declaration.

Note that 'static' in the C sense is NOT the same as 'static' in the C++ sense! 'static'
in the C sense (at file scope) gives a name internal linkage, so it is visible only within
its own module; that is more akin to 'protected' or 'private' in the C++ sense.

In general, try as much as possible to avoid static member functions in C++. They have
very specialized utility and should only be used when appropriate. Probably 80% of the use
of static member functions I see is absolutely inappropriate.

A static member function does not require an object to exist in order to invoke the
method. It is the only way to handle various kinds of callback scenarios. It cannot
possibly use a 'this', since 'this' does not exist for a static member function, so the
question of knowing whether or not it can never arises. It can't, ever. If you try to use
'this' either implicitly or explicitly, the compiler will give you an error.

Rule 1: Use of a static member function is a mistake
Rule 2: If you think you need a static member function, think again. You are probably
wrong
Rule 3: If, having thought about it, you still think you need a static member function,
you are probably still wrong. Think some more.
Rule 4: If, after careful thought, you realize that from a functional and structural
viewpoint, a static member function is what you need, there is a passably good chance you
are right.

Most uses I see of static variables and methods in C++ are attempts to slam some
antiquated C-hack into a C++ environment, and the usage is usually wrong. Other than the
few restricted cases where they are necessary (primarily callbacks, including top-level
thread functions), they should be avoided as much as possible. Even then, the idea is to
get out of the C space and into the C++ space as quickly as possible (e.g., see my essay
on worker threads on my MVP Tips site, or my essay on callbacks in MFC).
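
A rough sketch of that callback pattern, not taken from the essays mentioned above. InvokeCallback stands in for any C-style API that takes a function pointer plus an opaque context value; Accumulator, Thunk and Add are hypothetical names:

```cpp
#include <cassert>

// Stand-in for a C-style API (thread start, enumeration, timer) that accepts
// a plain function pointer plus an opaque context pointer.
typedef int (*Callback)(void* context, int value);
int InvokeCallback(Callback cb, void* context, int value) { return cb(context, value); }

class Accumulator {
public:
    Accumulator() : m_total(0) {}

    // Static "thunk": matches the C signature, has no 'this', and immediately
    // forwards to the real member function to get back into C++ space.
    static int Thunk(void* context, int value) {
        return static_cast<Accumulator*>(context)->Add(value);
    }

    int Add(int value) { m_total += value; return m_total; }
    int Total() const { return m_total; }

private:
    int m_total;
};

// Hypothetical self-check; call it from main().
void check_callback() {
    Accumulator acc;
    InvokeCallback(&Accumulator::Thunk, &acc, 5);
    InvokeCallback(&Accumulator::Thunk, &acc, 7);
    assert(acc.Total() == 12);
}
```

The static function does nothing except recover the object from the context pointer; all the real work stays in an ordinary member function.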

Another valid use is when the purpose of the static method is to return a C++ object which
is often created for that purpose, e.g., CWnd::FromHandle is a static method. It typifies
another of the valid static method scenarios.

Note also that C# and Java do not use a "whole program" compilation model; there are ways
of handling separate compilation in C# and Java; in real systems these are commonly the
way the systems are built. In this case, the same issues arise.
joe



Joseph M. Newcomer [MVP]
email: (e-mail address removed)
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
 
Joseph said:
Note also that C# and Java do not use a "whole program" compilation
model; there are ways of handling separate compilation in C# and
Java; in real systems these are commonly the way the systems are
built. In this case, the same issues arise.
joe

From a language semantics standpoint, they very much do. The fact that the
compiler in most cases really does work a module at a time is an
implementation detail. The fact that Java & C# have no concept of
declaration independent from definition means that the issues surrounding
header files in C/C++ simply are not relevant for those languages.

-cd
 
Not that I've ever seen. The using directive is much like #include. It doesn't generate
code, it just reads the definition information. The fact that this is identical to the
source which has, at some other time in the past, been read by the compiler to generate
actual code (or will be at some future time) is a separate issue. Just because the
declarations are not disjoint from the definitions does not change the fact that you
specify the module. All that happens here is the "header" is the "implementation". But at
the point where you do a using, you can think of it as being only the header file. So it
is not a "whole program" approach. It is very much modular.
joe

 
The important difference between a C/C++ "header file" and a .NET namespace,
or a Java package is that using directives and import directives are always
idempotent and they have precisely defined language semantics that cause
definitions in one "module" to be visible in another "module". A header file
can accomplish the same in C/C++, but only as a result of careful
construction. If the header file uses include guards (to make it
idempotent) and contains only declarations (so as to not violate the ODR),
then an include file can serve as a form of module reference, but there's
nothing intrinsic in the compiler, the preprocessor or the language that
requires it to be so - it's completely up to the programmer to do it.

-cd
 
Joseph M. Newcomer said:
Not that I've ever seen. The using directive is much like #include. It doesn't generate
code, it just reads the definition information. The fact that this is identical to the
source which has, at some other time in the past, been read by the compiler to generate
actual code (or will be at some future time) is a separate issue. Just because the
declarations are not disjoint from the definitions does not change the fact that you
specify the module. All that happens here is the "header" is the "implementation". But at
the point where you do a using, you can think of it as being only the header file. So it
is not a "whole program" approach. It is very much modular.

The using directive is more like linking with an external DLL. You don't
need it within a single module. In Java and C#, you can have two source
files like this:

// file 1
class A
{
B b;
};

// file 2
class B
{
A a;
};

and it will compile. Try pulling this trick in C++. If you say "forward
declarations", I'd say that's the whole point - Java and C# don't have
any and don't need any, because conceptually the module is compiled as a
whole, all source files are processed "simultaneously".
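
For comparison, a sketch of the nearest C++ counterpart, kept in one file for brevity. Members of class type in Java and C# are references, so pointers are the closest match; the constructors and the self-check function are additions for illustration:

```cpp
#include <cassert>

class B;            // forward declaration: B is an incomplete type from here on

class A {
public:
    A() : b(nullptr) {}
    B* b;           // a pointer to an incomplete type is allowed
};

class B {
public:
    B() : a(nullptr) {}
    A* a;           // A is complete here; a pointer keeps the layout finite
};

// Hypothetical self-check; call it from main().
void check_mutual_reference() {
    A a;
    B b;
    a.b = &b;
    b.a = &a;
    assert(a.b->a == &a);
}
```

Without the `class B;` line, the declaration of `B* b` inside A does not compile, and declaring the members by value (`B b;` and `A a;`) cannot work in C++ at all, since each object would have to contain the other.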
--
With best wishes,
Igor Tandetnik

"For every complex problem, there is a solution that is simple, neat,
and wrong." H.L. Mencken
 
A couple of additional points

If you're using VC++, it has a wonderful pragma to ensure that headers
don't get included more than once. In general it's a bad idea to
include headers multiple times, for the obvious problems already
mentioned (redefinition errors and warnings) and for certain less
obvious problems (statics).

The directive is
#pragma once
and you should just get in the habit of sticking it at the front of
any header file that is going to be used strictly by VC++ (it's a
non-standard extension, so other compilers aren't required to support it).

To address the back and forth on whether duplicate headers will cause
more code to be generated, possibly more than once: straight C style
stuff is pretty safe, although the act of reading a header will affect
any preprocessor stuff, i.e. you can get the compiler complaining
about macro redefs. A very subtle problem is code bloat due to
templatization and inlines, in that the rules for generating template
instances are complex and can bite you where it hurts. Likewise, you
can get excessive inline expansion in certain cases. Doesn't sound
like you're using templates and inlines, but if you are, move them all
(defs, not decls) from .h to .inl and then set up an include
mechanism in your .cpp files like this
#if !INLINESVISIBLE
#include "..\inl\CGeometry.inl"
#endif

If you've told the compiler to outline inlines, mostly in debug mode,
then this isn't an issue, but if it's generating inlines or you're
using templates, this may be a problem, but not as bad as it used to
be. Places to be extra special aware are headers for classes
instantiating stl classes.

That said, your mileage may vary, but as a rule of thumb it will help
keep you out of trouble :-)
 