Please Stop It Rebuilding Everything!

Thread starter: Andy Capon

Andy Capon

Hi There,

We have a medium-sized project, about 2000 source files and 700,000
lines of code. As you can imagine, this takes some time to rebuild in full.

Now our problem is that we have a code generator we have developed
that updates some of our classes; as an example, let's say I update 50
files. When we do a build, the IDE builds nearly all of the source
files while we swear and then twiddle our thumbs for a couple of
hours.

It's almost as if it says "well, a lot of files have changed, I will just
rebuild everything".

Does anyone know why this happens?

We are using VS 2003, but 2002 did it as well.

As a side point, I have noticed that the upcoming version can use
multiple processors to build. Will this work within one project? Then
we could at least halve the time. We can build it on our 24-processor
IRIX box in about 10 minutes!

Any help would be much appreciated.

Well, back to twiddling my thumbs.

Cheers

Andy
 
Andy said:
> Hi There,
>
> We have a medium-sized project, about 2000 source files and 700,000
> lines of code. As you can imagine, this takes some time to rebuild in full.
>
> Now our problem is that we have a code generator we have developed
> that updates some of our classes; as an example, let's say I update 50
> files. When we do a build, the IDE builds nearly all of the source
> files while we swear and then twiddle our thumbs for a couple of
> hours.
>
> It's almost as if it says "well, a lot of files have changed, I will just
> rebuild everything".
>
> Does anyone know why this happens?

Have you analyzed the dependencies in your code to demonstrate that
significantly fewer files should be compiled? When you update classes with
the code generator, are unchanged files given a new timestamp? If so, that
could be a huge source of unnecessary re-work by the compiler.
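
If the generator does rewrite files whose contents have not changed, one common remedy is to compare before writing, so that untouched output keeps its old timestamp. The following is a minimal sketch, not the poster's generator: `write_if_changed` is a hypothetical helper name, and it relies on C++17 `std::filesystem`, which is newer than the compilers discussed in this thread.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper for a code generator: write `content` to `path` only
// when it differs from what is already on disk. Returns true if the file was
// (re)written, false if it was left untouched (timestamp preserved).
bool write_if_changed(const fs::path& path, const std::string& content) {
    if (fs::exists(path)) {
        std::ifstream in(path, std::ios::binary);
        std::string existing((std::istreambuf_iterator<char>(in)),
                             std::istreambuf_iterator<char>());
        if (existing == content) {
            return false;  // identical output: leave the timestamp alone
        }
    }
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out << content;
    return true;
}
```

A generator that routes every output file through a helper like this only disturbs the timestamps of files whose text really changed, which keeps timestamp-based dependency checks from seeing false modifications.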

Are you making proper use of precompiled headers? I'd expect a 700,000 line
program to build in 10 minutes tops, even on a single processor machine if
precompiled headers are used properly and consistently.

> We are using VS 2003, but 2002 did it as well.

I know there have been reports of VS always rebuilding everything, but I
don't know any of the details in those cases - hopefully someone else that
does know will reply as well.

> As a side point, I have noticed that the upcoming version can use
> multiple processors to build. Will this work within one project? Then
> we could at least halve the time.

No, it won't help you. The MP support in Whidbey works at the project
level, so if you have 2000 source files in one project, they'll all be built
sequentially. If you can restructure your code into several projects
(libraries, I'd assume), then you could get some parallel compilation
benefit.

-cd
 
Hi Carl,

Thanks for your reply; see my remarks below:

Carl Daniel said:
> Have you analyzed the dependencies in your code to demonstrate that
> significantly fewer files should be compiled? When you update classes with
> the code generator, are unchanged files given a new timestamp? If so, that
> could be a huge source of unnecessary re-work by the compiler.

I am definite that the dependencies affect a much smaller number of files.
Unchanged files remain totally unchanged by the code generator and are
read-only, as they are in SourceSafe!

> Are you making proper use of precompiled headers? I'd expect a 700,000 line
> program to build in 10 minutes tops, even on a single processor machine if
> precompiled headers are used properly and consistently.

We are using precompiled headers, but maybe not properly. Could you direct
me to some info on how to use them properly? If we could get this to build in
10 minutes you would be a hero!

> I know there have been reports of VS always rebuilding everything, but I
> don't know any of the details in those cases - hopefully someone else that
> does know will reply as well.
>
> No, it won't help you. The MP support in Whidbey works at the project
> level, so if you have 2000 source files in one project, they'll all be built
> sequentially. If you can restructure your code into several projects
> (libraries, I'd assume), then you could get some parallel compilation
> benefit.

Well, that's a shame. So it's not really any different from loading two
copies of the IDE with different projects and building in both, then!
 
Hi Andy,
> Well, that's a shame. So it's not really any different from loading two
> copies of the IDE with different projects and building in both, then!

No, it is different, since we do track dependencies between projects. In
future versions we will enable intra-project parallelism. We did the work on
concurrent PDB access for this release, and that was the biggest blocker.

However, the fastest approach will remain using the multiprocessor support
to build separate projects in parallel. Concurrent access to the PDB incurs
serialisation overhead that you otherwise avoid. This is the way we
structure all our builds at Microsoft.

For precompiled headers, the top-level tips are: do NOT use /YX; do use /Yc
and /Yu. Make sure you have as many of the shared dependencies as possible in
a set of .pch files, and make sure you don't create one huge .pch file if
most of your .cpp files depend on only a small part of it.

Try compiling stable code into libraries and linking them in.

If you determine that there is an issue with dependency checking, we can
definitely track that down if you can provide a reasonable repro. If you
cannot, you can call our product support services and work with them. If the
cause is indeed a bug in the product, you get reimbursed for the support
incident (if you don't already have free incidents from buying the product
or MSDN).

Ronald Laeremans
Visual C++ team
 
We have seen a lot of "always rebuild" problems over the years.
The two most common causes:
1. Extra, unneeded dependencies specified in the source code.
2. A bug in our dependency scanner for .rc files: we did not preprocess
the file, so we tried to resolve every single include (even those that
were #ifdef'ed out) and recompiled the .rc file if any one of them was
not found.

The first we can't really fix, since it's the developer's source code
that needs to be fixed :) The second we fixed in VS2005 (and we may even
have a hotfix for earlier releases, but I'm not certain).

Does your code belong to either category?

If not and you really suspect it is a bug in our dependency scanner,
then you should call product support to get a fix for it as Ronald
suggests in another response.

Thanks,
 
Hello,

As the guys from MS have said, 700,000 lines of code is actually not very
much at all. I can rebuild a much larger suite of projects from scratch in
about 40 minutes on a reasonable, but not super high-end, machine: single
processor, hyper-threaded, RAID 1 drive (redundant). So you might simply
consider getting a machine that has a fast drive. I found that with this
RAID 1 setup, things became much, much faster. With RAID 0 (striping) it
may be faster still. (Perhaps I have RAID 0 and 1 backwards here, I can
never remember; I have the redundant drive version.)

However, you will get a huge increase in speed from using precompiled
headers correctly, as one of the posters mentioned. I think we probably got
a 50-to-1 improvement in build time for some projects from using precompiled
headers.

We don't have any projects with 2000 source files (50-100 is largest for
us), so your use of precompiled headers may need to be slightly different
than ours, but I can give you a snapshot of how we use it.

In each project that warrants it, we create a file "precomp.h" into which we
place all of the Windows and system headers (crt / stdc++ etc.) that are
reasonable for the project. We also place headers in there from our own
suite of libraries that are used by the project being built. If
appropriate, some project-local headers may also be placed there, but only
if they never (or very, very rarely) change.

Create a file "precomp.cpp" that has one line of text - #include
"precomp.h". Add this file to your project.

In every single one of your files being compiled (i.e. all of your ".cpp"
files), insert #include "precomp.h" as the very first non-comment text line.

Change your project settings so that "Use precompiled headers (/Yu)" is
selected as the default. Indicate that "precomp.h" is the header to use.

Then change the file-specific setting for "precomp.cpp" to be "Create
precompiled header (/Yc)". Again specify "precomp.h" as the appropriate
header.

That should be it. It might be painful to insert that #include "precomp.h"
in every one of your source files, but it is a one-time process. For a
project with 2000 files it may be appropriate to have a number of different
precompiled headers, but these .pch files are quite large, and if the
compiler has to switch from using one to using another between different
translation units, that may slow it down because of disk access. Once the
.pch file is loaded and in the disk cache, retrieving it is very fast,
which is a big win.
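
The layout described above can be sketched as a single listing. The marker comments show which file each part belongs to. `word_count.cpp` and `count_words` are hypothetical names, the three standard headers stand in for whatever stable headers a real project would precompile, and the `#include "precomp.h"` lines are left as comments only so that the listing compiles as one file.

```cpp
// ---- precomp.h ------------------------------------------------------------
// Stable headers only: system and library headers that rarely change.
// The three standard headers below are placeholders for your own list.
#ifndef PRECOMP_H
#define PRECOMP_H
#include <map>
#include <string>
#include <vector>
#endif  // PRECOMP_H

// ---- precomp.cpp (the one file set to "Create precompiled header (/Yc)") ---
// Its only line would be:
//     #include "precomp.h"

// ---- word_count.cpp, a hypothetical source file (built with /Yu) -----------
// Its first non-comment line would be:
//     #include "precomp.h"
// After that it can use anything the precompiled header provides.
std::map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];  // operator[] value-initialises a missing entry to 0
    }
    return counts;
}
```

Note that `word_count.cpp` includes none of the standard headers directly; everything comes through "precomp.h", which is why that header should hold only things that rarely change.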

I would also seriously consider breaking up your project into pieces,
as 2000 source files in one project seems excessive. If you look at large
library suites that people have out there (GDAL comes to mind), I think you
will see that things are mostly broken up into more manageably sized
collections of files per project (.dll or .lib).

Anyway, the bottom line is that I would look to these other solutions first
rather than assuming that the dependency checking in DevStudio is
malfunctioning. Whenever I have thought there was something going wrong with
it in the past, I have come to realize that there was some dependency I
wasn't considering.

Good luck.

-Eric Twietmeyer

PS However, I do have a gripe about C# projects. As far as I can tell, they
always rebuild; there doesn't seem to be a notion for these projects that
all targets are up to date and don't need to be touched. They simply
always rebuild completely. Don't know why....
 
Eric said:
> PS However, I do have a gripe about C# projects. As far as I can tell, they
> always rebuild; there doesn't seem to be a notion for these projects that
> all targets are up to date and don't need to be touched. They simply
> always rebuild completely. Don't know why....

Think about the dependency analysis when there are no .OBJ files...

I recall hearing/reading that the C# compiler does do some kind of partial
rebuild, but it seems that's either not the case or it's not very
effective.

-cd
 
Hi everyone,

Thanks for your replies, and from what I see we don't seem to be doing
anything wrong.

First of all, about total build times:

We use precompiled headers for all headers that do not change often:
basically windows.h, all the STL headers we use, and some of our own includes.

We use /Yu, and if anything in the precompiled header changes, we manually
rebuild it by switching to /Yc, rebuilding one file to create the precompiled
header, and then switching back to /Yu to build the rest.

We have fast machines with RAID 0 across three Ultra320 15K disks and dual
Xeon processors.


Now concerning problems with dependencies:

I can reproduce this problem here easily, and it is not the .rc problem that
Tarek mentioned, as it is a console app. It is also not unneeded dependencies
in the source code. Totally unrelated files are built!

Also, let's say I regenerate 50 of our classes: about 75% of the time it will
rebuild just these 50 and the classes that depend on them; the other 25% of
the time it will rebuild nearly everything.

If I only regenerate 10 classes, then it will rebuild everything only very
rarely, say 5% of the time.

If I regenerate a large number of classes, say 1000, it will nearly always
rebuild everything.

So to us it looks like the more files that are changed, the higher the chance
of it rebuilding everything.

I think I will take the advice of creating library projects to see if that
will help us.

Thanks for all your replies

Andy
 
I would really encourage you to call MS support to help them diagnose this
issue and get you a fix if it turns out to be a product bug.

Trying this with the Beta of Visual C++ 2005 would also be a great data
point.

Thanks.

Ronald
 