Multicore systems and XP 32/Longhorn - XP x64

  • Thread starter: Ioannis Vranos

Ioannis Vranos

Since multicore processors are about to become mainstream,
multithreading will become a main concern too.

However, I think that for small/medium-sized applications
multithreading optimisation should not be a major concern, apart from the
cases where it naturally makes sense (for example, a downloading
application where one thread performs the network connection and download
and a separate thread updates the displayed information: download speed,
how much has been downloaded, remaining time, etc.).
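
To make the downloader case concrete, here is a minimal C++ sketch of one thread doing the "download" while another thread reads the progress. Everything in it is an assumption for illustration: downloadWorker and runDownload are hypothetical names, and the network wait is faked with a short sleep.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a real download: "receives" totalBytes in
// chunkBytes-sized pieces and publishes progress through an atomic counter.
void downloadWorker(std::size_t totalBytes, std::size_t chunkBytes,
                    std::atomic<std::size_t>& received)
{
    assert(chunkBytes > 0);
    std::size_t done = 0;
    while (done < totalBytes) {
        const std::size_t step = std::min(chunkBytes, totalBytes - done);
        // Pretend to wait for the network.
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        done += step;
        received.store(done, std::memory_order_relaxed);
    }
}

// Runs the worker on its own thread while the calling thread polls the
// counter, as a UI thread would. Returns the observed percentages.
std::vector<int> runDownload(std::size_t totalBytes, std::size_t chunkBytes)
{
    std::atomic<std::size_t> received{0};
    std::thread worker(downloadWorker, totalBytes, chunkBytes,
                       std::ref(received));

    std::vector<int> samples;
    std::size_t seen = 0;
    while (seen < totalBytes) {
        seen = received.load(std::memory_order_relaxed);
        samples.push_back(static_cast<int>(seen * 100 / totalBytes));
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    worker.join();
    return samples;
}
```

Calling runDownload(1000, 100) returns a rising series of percentages ending at 100. A single atomic counter is enough here because only one thread writes it; a real application would also need to share speed and time estimates.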

Windows, on the other hand, would assign different processes
(applications) to different processor cores on its own, for more
efficient processor core utilisation.


My question is this. Do Windows XP32/Longhorn - XP x64 assign different
processes to different processors in multiprocessor systems (and I
assume this applies also to multicore CPUs), or not?
 
Ioannis Vranos said:
My question is this. Do Windows XP32/Longhorn - XP x64 assign different
processes to different processors in multiprocessor systems (and I assume
this applies also to multicore CPUs), or not?

You'd probably be wise to post again in the kernel group:

microsoft.public.win32.programmer.kernel

That said, the scheduler's unit of work for Win32 operating systems is a
thread, not a process. Every process can have its "affinity" (natural
attraction) for a set of processors set with SetProcessAffinityMask().
Threads can likewise be drawn to processors with SetThreadAffinityMask(),
where the thread's mask must be a subset of the owning process's mask.
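
The subset rule can be sketched as plain bit arithmetic. This is a portable C++ model only: it does not call the Win32 affinity functions, and AffinityMask, allProcessors and isValidThreadMask are hypothetical names introduced for the example. It assumes one bit per processor and at most 64 processors.

```cpp
#include <cassert>
#include <cstdint>

// Bit i set means "processor i is allowed". Assumes at most 64 processors.
using AffinityMask = std::uint64_t;

// Mask with the lowest processorCount bits set, i.e. "every processor".
AffinityMask allProcessors(unsigned processorCount)
{
    assert(processorCount <= 64);
    return processorCount == 64 ? ~AffinityMask{0}
                                : (AffinityMask{1} << processorCount) - 1;
}

// A thread mask is acceptable only if it is non-empty and names no
// processor outside the owning process's mask.
bool isValidThreadMask(AffinityMask processMask, AffinityMask threadMask)
{
    return threadMask != 0 && (threadMask & ~processMask) == 0;
}
```

For example, with a process mask of 0xF (processors 0 to 3), a thread mask of 0x5 passes the check, while 0x10 does not because processor 4 is outside the process mask.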

As far as I know, and I might be mistaken which is why you should post again
in the kernel group, processes and threads by _default_ have an affinity for
each and every processor in the system.

Regards,
Will
 
William said:
You'd probably be wise to post again in the kernel group:

microsoft.public.win32.programmer.kernel

That said, the scheduler's unit of work for Win32 operating systems
is a thread, not a process. Every process can have its "affinity"
(natural attraction) for a set of processors set with
SetProcessAffinityMask(). Threads can likewise be drawn to
processors with SetThreadAffinityMask(), where the thread's mask
must be a subset of the owning process's mask.

As far as I know, and I might be mistaken which is why you should
post again in the kernel group, processes and threads by _default_
have an affinity for each and every processor in the system.

Yes, they do. Each thread also has a "preferred processor" and will be
scheduled on that processor, if possible, whenever it's ready to run. The
preferred processor starts out at 0 when the process is created and is
incremented modulo the number of processors every time a thread is created.
Win XP and later is also aware of the cache affinity between virtual
processors with hyperthreading turned on, and will use that information to
try to keep a thread within a single physical CPU if it has a choice.
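
The round-robin rule described above can be written out as a toy C++ model. This is not kernel code: PreferredProcessorAssigner and assignThreads are hypothetical names, and the model only captures "start at 0, increment modulo the processor count on every thread creation".

```cpp
#include <cassert>
#include <vector>

// Toy model of the preferred-processor rule; not the actual scheduler.
class PreferredProcessorAssigner {
public:
    explicit PreferredProcessorAssigner(unsigned processorCount)
        : processorCount_(processorCount), next_(0)
    {
        assert(processorCount > 0);
    }

    // Returns the preferred processor for a newly created thread and
    // advances the counter modulo the number of processors.
    unsigned onThreadCreated()
    {
        const unsigned assigned = next_;
        next_ = (next_ + 1) % processorCount_;
        return assigned;
    }

private:
    unsigned processorCount_;
    unsigned next_;
};

// Preferred processors for threadCount threads created one after another.
std::vector<unsigned> assignThreads(unsigned processorCount,
                                    unsigned threadCount)
{
    PreferredProcessorAssigner assigner(processorCount);
    std::vector<unsigned> result;
    for (unsigned i = 0; i < threadCount; ++i) {
        result.push_back(assigner.onThreadCreated());
    }
    return result;
}
```

With two processors and five threads, the model hands out 0, 1, 0, 1, 0, so successive threads of one process are spread over the processors.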

-cd
 
Carl said:
Yes, they do. Each thread also has a "preferred processor" and will be
scheduled on that processor, if possible, whenever it's ready to run. The
preferred processor starts out at 0 when the process is created and is
incremented modulo the number of processors every time a thread is created.
Win XP and later is also aware of the cache affinity between virtual
processors with hyperthreading turned on, and will use that information to
try to keep a thread within a single physical CPU if it has a choice.


That's nice. So if we have two separate .NET applications running on a
multiprocessor/multicore system, may we assume that they will both run
on different processors/cores if they are available?


Or is there the possibility that both will run on one, and the other one
will be sitting idle?
 
Ioannis said:
That's nice. So if we have two separate .NET applications running on a
multiprocessor/multicore system, may we assume that they will both run
on different processors/cores if they are available?

IIUC, the kernel tries to do that, but there's no guarantee of how
successful it'll be.

Or is there the possibility that both will run on one, and the other
one will be sitting idle?

Entirely possible, but only for brief periods I would think. A thread will
never be switched to another processor unless it has used up its quantum or
blocked, so if other processes had the other processor(s) busy, both of your
processes might end up on the same CPU.

-cd
 
Carl said:
Entirely possible, but only for brief periods I would think. A thread will
never be switched to another processor unless it has used up its quantum or
blocked, so if other processes had the other processor(s) busy, both of your
processes might end up on the same CPU.


OK, thanks all for the insight. :-) So what do you think of my text:

"for small/medium-sized [.NET] applications multithreading optimisation
should not be a major concern apart from the cases where it makes sense
(for example a downloading application where one thread performs the
network connection and download and a separate thread updates the data
information: download speed, how much has been downloaded, remaining
time etc.).

Windows on the other hand would assign different processes
(applications) to different processor cores on its own for more
efficient processor core utilisation."
 
Ioannis said:
Carl said:
Entirely possible, but only for brief periods I would think. A thread
will never be switched to another processor unless it has used up its
quantum or blocked, so if other processes had the other processor(s)
busy, both of your processes might end up on the same CPU.



OK, thanks all for the insight. :-) So what do you think of my text:

"for small/medium-sized [.NET] applications multithreading optimisation
should not be a major concern apart from the cases where it makes sense
(for example a downloading application where one thread performs the
network connection and download and a separate thread updates the data
information: download speed, how much has been downloaded, remaining
time etc.).

Windows on the other hand would assign different processes
(applications) to different processor cores on its own for more
efficient processor core utilisation."

That is not *really* the way it works. The Windows scheduler is
basically oblivious to processes; it only knows about threads and will
run a thread on the best available (potentially hyperthreaded, virtual)
core. Which process a thread belongs to plays no real role. If every
process had only a single thread (definitely not true for .NET code,
since the runtime itself starts several threads for tasks like running
finalizers), then what you stated would more or less equate to what
happens.

Ronald
 
Ronald said:
That is not *really* the way it works. The Windows scheduler is
basically oblivious to processes; it only knows about threads and will
run a thread on the best available (potentially hyperthreaded, virtual)
core. Which process a thread belongs to plays no real role. If every
process had only a single thread (definitely not true for .NET code,
since the runtime itself starts several threads for tasks like running
finalizers), then what you stated would more or less equate to what
happens.


OK, thanks a lot.
 
Ioannis said:
OK, thanks all for the insight. :-) So what do you think of my text:
"for small/medium-sized [.NET] applications multithreading
optimisation should not be a major concern apart from the cases
where it makes sense (for example a downloading application where
one thread performs the network connection and download and a
separate thread updates the data information: download speed, how
much has been downloaded, remaining time etc.).

Windows on the other hand would assign different processes
(applications) to different processor cores on its own for more
efficient processor core utilisation."

This is not really the way the Windows scheduler works. First, what is
assigned to one processor or another is a thread, not a process, and there
are always several threads in a .NET process (since the framework starts
some worker threads by itself).
Second, once started, a thread may be switched at almost any time from one
(virtual) processor to another. Therefore, a thread does not run on *one*
processor: rather, each of its timeslices is scheduled separately on one of
the available processors. The scheduler algorithm tries to keep a thread
on the same processor (to optimize locality), but it doesn't always
succeed because of other threads' activity on the system. The scheduler in
XP and 2003 is aware of hyperthreaded processors, so it will try to keep a
thread on the same physical processor, even if it must switch it from one
virtual processor to another.
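
The preference order described above (same logical processor first, then a sibling on the same physical processor, then anything idle) can be sketched as a toy C++ function. This is an assumption-laden model, not the real Windows algorithm, which also weighs priorities, affinity masks and more. LogicalCpu and pickCpu are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// One logical (virtual) processor as seen by the toy scheduler.
struct LogicalCpu {
    int physicalId; // physical processor this logical CPU belongs to
    bool idle;      // true if nothing is currently running on it
};

// Picks a logical CPU for a thread's next timeslice.
// lastCpu is the index the thread last ran on, or -1 if it never ran.
// Returns the chosen index, or -1 if no logical CPU is idle.
int pickCpu(const std::vector<LogicalCpu>& cpus, int lastCpu)
{
    const int count = static_cast<int>(cpus.size());
    assert(lastCpu >= -1 && lastCpu < count);

    if (lastCpu >= 0) {
        // 1. Best locality: the same logical CPU as last time.
        if (cpus[lastCpu].idle) {
            return lastCpu;
        }
        // 2. Next best: a sibling on the same physical processor.
        for (int i = 0; i < count; ++i) {
            if (cpus[i].idle &&
                cpus[i].physicalId == cpus[lastCpu].physicalId) {
                return i;
            }
        }
    }
    // 3. Otherwise any idle logical CPU.
    for (int i = 0; i < count; ++i) {
        if (cpus[i].idle) {
            return i;
        }
    }
    return -1;
}
```

On a machine modelled as two physical processors with two logical CPUs each, a thread that last ran on a now-busy logical CPU 0 is moved to its idle sibling 1 before the model considers the other physical processor.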

Concerning multicore, I don't think the scheduler is aware of it now (btw,
the multicore models from AMD and Intel are quite different, since AMD has
the memory controller on the chip, whereas Intel has not; I presume that
would mean different optimizations in the scheduler). But even without
those optimizations, the scheduler is quite capable of handling multicore
systems efficiently.

Read "Microsoft Windows Internals" for more details on this subject.

What is certain, however, is that the switch to
hyperthreaded/multicore/multiprocessor systems will definitely change the
way software developers can take advantage of the constant growth in
processor power: see http://www.gotw.ca/publications/concurrency-ddj.htm
for an interesting discussion on the subject.

Arnaud
MVP - VC
 
Arnaud said:
What is certain, however, is that the switch to
hyperthreaded/multicore/multiprocessor systems will definitely change the
way software developers can take advantage of the constant growth in
processor power: see http://www.gotw.ca/publications/concurrency-ddj.htm
for an interesting discussion on the subject.


Yes, however if Windows spreads the various programs across different
cores, I think the multithreading optimisation "frenzy" can be restricted
to the large-applications domain only.
 
Ioannis said:
Yes, however if Windows spreads the various programs across different
cores, I think the multithreading optimisation "frenzy" can be restricted
to the large-applications domain only.


or better termed, to heavy-processing applications only.
 
Ioannis said:
or better termed, to heavy-processing applications only.

To some degree. However, almost every process has multiple threads
(though a very simple process will have one "original" thread that
just sits around waiting for the "real" thread to finish).

But using some libraries, or even common controls, creates additional
threads. Some examples: multimedia, Open File dialog, ODBC and other
database access, etc. In some cases, these extra threads are for
convenience; in other cases, they make processing quicker on
multiprocessor/multicore systems.

My main application creates background threads specifically for
long-running tasks; these are useful on all types of systems, even
single-processor systems, as they allow work to be done in one thread
while others are waiting for I/O, without requiring Windows-specific
asynchronous I/O and the voluminous housekeeping involved.
 
Ioannis said:
or better termed, to heavy-processing applications only.

This is true if there are at most a handful of cores. But some of the
designs Intel and other CPU designers have been talking about would put
100+ cores on a die within the next 10-12 years. If that is the case, to
keep scaling, even individual applications will need to take advantage.
Even apps that aren't normally compute intensive tend to have spurts of
time where they are.

Ronald
 
Ronald said:
This is true if there are at most a handful of cores. But some of the
designs Intel and other CPU designers have been talking about would put
100+ cores on a die within the next 10-12 years. If that is the case, to
keep scaling, even individual applications will need to take advantage.
Even apps that aren't normally compute intensive tend to have spurts of
time where they are.


In such a system, "100" light single-threaded applications would each be
assigned to a different core, so we would still get scaling.
 
Ioannis said:
In such a system, "100" light single-threaded applications would each be
assigned to a different core, so we would still get scaling.

Luckily I don't have that many applications running on my system. And if
I want to compile my program, it had better have a compiler that uses all
of my CPU effectively, not just 1% of it.
 
Ronald said:
Luckily I don't have that many applications running on my system. And if
I want to compile my program, it had better have a compiler that uses all
of my CPU effectively, not just 1% of it.


Well, I suppose it will be impossible to optimise a light application to
take advantage of 100 CPUs at the same time: acquiring and releasing
thread locks would multiply the run-time and space overhead with
essentially no performance gain.


I think in these situations (light/medium CPU intensive applications)
the OpenMP standard (http://www.openmp.org), which is "non-invasive", will
provide the needed multithreading when it is needed, while explicit
thread locks will make sense for CPU heavy applications.
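
To show what "non-invasive" means in practice, here is a minimal C++ sketch, assuming a loop whose iterations are independent. sumOfSquares is a hypothetical function made up for the example; the only OpenMP-specific part is the single pragma line.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Sums the squares of the values. With OpenMP enabled, the pragma splits
// the loop across threads and combines the per-thread partial sums.
// A compiler without OpenMP ignores the pragma and runs the loop serially,
// producing the same result.
long long sumOfSquares(const std::vector<int>& values)
{
    assert(values.size() <= static_cast<std::size_t>(INT_MAX));
    const int n = static_cast<int>(values.size());
    long long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += static_cast<long long>(values[i]) * values[i];
    }
    return total;
}
```

The same source builds with or without OpenMP support (for example /openmp in Visual C++ or -fopenmp in GCC), and no locks appear in the code: the reduction clause takes care of combining the partial results.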


The bottom line is, since for now we have nearly reached the end of
clock-speed enhancements, do not expect the same speed gains as before,
even if you use lock-based multithreading to perform simple additions. :-)
 
Ioannis said:
Well, I suppose it will be impossible to optimise a light application
to take advantage of 100 CPUs at the same time: acquiring and releasing
thread locks would multiply the run-time and space overhead with
essentially no performance gain.

I think in these situations (light/medium CPU intensive applications)
the OMP standard (http://www.openmp.org) which is "non-invasive", will

... which no doubt explains (partly) why VC 2005 supports OpenMP.

be providing the needed multithreading when it is needed, while the
thread lock will make sense for CPU heavy applications.

... which is why there's so much research going on in the design and theory
of lock-free data structures.
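
As a small illustration of the idea, here is a minimal C++ sketch of a lock-free stack that uses an atomic compare-and-swap instead of a lock. LockFreeStack and pushFromThreads are hypothetical names. To stay simple it only supports push and a "take everything" drain, which avoids the memory-reclamation problems a general lock-free pop has to solve.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

class LockFreeStack {
    struct Node {
        int value;
        Node* next;
    };
    std::atomic<Node*> head_{nullptr};

public:
    ~LockFreeStack() { drain(); }

    // Links a new node in front of the current head. If another thread
    // changed head_ in the meantime, the compare-exchange fails, refreshes
    // node->next with the new head, and the loop retries.
    void push(int value)
    {
        Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Detaches the whole list in one atomic step and returns its values,
    // most recently pushed first.
    std::vector<int> drain()
    {
        Node* node = head_.exchange(nullptr, std::memory_order_acquire);
        std::vector<int> values;
        while (node != nullptr) {
            values.push_back(node->value);
            Node* next = node->next;
            delete node;
            node = next;
        }
        return values;
    }
};

// Pushes from several threads at once and returns how many items arrived.
std::size_t pushFromThreads(unsigned threadCount, int itemsPerThread)
{
    assert(itemsPerThread >= 0);
    LockFreeStack stack;
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < threadCount; ++t) {
        threads.emplace_back([&stack, itemsPerThread] {
            for (int i = 0; i < itemsPerThread; ++i) {
                stack.push(i);
            }
        });
    }
    for (std::thread& th : threads) {
        th.join();
    }
    return stack.drain().size();
}
```

No thread ever waits on a lock here; a thread that loses the race simply retries with the updated head. With 4 threads pushing 1000 items each, all 4000 items end up in the stack.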

-cd
 